Part-of-Speech (POS) Tagging System
A Part-of-Speech (POS) Tagging System is a word mention tagging system that applies a part-of-speech tagging algorithm to solve a POS tagging task.
- AKA: PoS Tagger, POS Tagging System, Grammatical Tagging System, Morphosyntactic Disambiguation System, Tagging System.
- Context:
- It can range from being a Heuristic Part-of-Speech Tagging System to being a Data-Driven Part-of-Speech Tagging System.
- It can range from being a Rule-based Part-of-Speech Tagging System to being a Probabilistic Part-of-Speech Tagging System.
- It can make use of a Tagger Dictionary.
- It can use/produce a Part-of-Speech Tagging Function.
- Example(s):
- Counter-Example(s):
- See: Natural Language Processing System, Word Sense Disambiguation System, Noun, Verb, Pronoun, Adjective, Morphology, Syntax, Lexicon.
References
2017
- (Sammut & Webb, 2017) ⇒ Claude Sammut, and Geoffrey I. Webb. (2017). “Part of Speech Tagging.” In: (Sammut & Webb, 2017). DOI: 10.1007/978-1-4899-7687-1_100357
- QUOTE: Part-of-speech tagging (POS tagging) is a process in which each word in a text is assigned its appropriate morphosyntactic category (for example noun-singular, verb-past, adjective, pronoun-personal, and the like). It therefore provides information about both morphology (structure of words) and syntax (structure of sentences). This disambiguation process is determined both by constraints from the lexicon (what are the possible categories for a word?) and by constraints from the context in which the word occurs (which of the possible categories is the right one in this context?). For example, a word like table can be a noun-singular, but also a verb-present (as in I table this motion). This is lexical knowledge. It is the context of the word that should be used to decide which of the possible categories is the correct one.
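The two kinds of constraint described in the quote above can be illustrated with a minimal sketch. The lexicon entries, the default category for unknown words, and the single contextual rule are all hypothetical choices made for illustration; they are not taken from the cited source or from any real tagger.

```python
# Toy lexicon (hypothetical): each word maps to its possible categories,
# with the assumed most frequent category listed first.
LEXICON = {
    "i": ["pronoun-personal"],
    "table": ["noun-singular", "verb-present"],
    "this": ["determiner"],
    "the": ["determiner"],
    "motion": ["noun-singular"],
    "is": ["verb-present"],
    "old": ["adjective"],
}


def tag(tokens):
    """Tag tokens using lexical candidates plus one contextual rule."""
    tags = []
    for token in tokens:
        # Lexical constraint: which categories are possible for this word?
        # Unknown words default to noun-singular (an assumption).
        candidates = LEXICON.get(token.lower(), ["noun-singular"])
        previous = tags[-1] if tags else None
        choice = candidates[0]
        # Contextual constraint (assumed rule): directly after a personal
        # pronoun, prefer the verb reading when the lexicon allows it.
        if previous == "pronoun-personal" and "verb-present" in candidates:
            choice = "verb-present"
        tags.append(choice)
    return list(zip(tokens, tags))


print(tag(["I", "table", "this", "motion"]))
print(tag(["The", "table", "is", "old"]))
```

With these toy settings, "table" is tagged as a verb after "I" and as a noun after "The". A practical system would need a far larger lexicon and many more contextual constraints, whether hand-written or learned.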
1992
- (Brill, 1992) ⇒ Eric D. Brill. (1992). “A Simple Rule-based Part of Speech Tagger.” In: Proceedings of the Conference on Applied Natural Language Processing (ANLP 1992).
- QUOTE: One area in which the statistical approach has done particularly well is automatic part of speech tagging, assigning each word in an input sentence its proper part of speech [Church 88, Cutting et al. 92, DeRose 88, Deroualt and Merialdo 86, Garside et al. 87, Jelinek 85, Kupiec 89, Meteer et al. 91]. Stochastic taggers have obtained a high degree of accuracy without performing any syntactic analysis on the input. These stochastic part of speech taggers make use of a Markov model which captures lexical and contextual information. The parameters of the model can be estimated from tagged [Church 88, DeRose 88, Deroualt and Merialdo 86, Garside et al. 87, Meteer et al. 91] or untagged [Cutting et al. 92, Jelinek 85, Kupiec 89] text. Once the parameters of the model are estimated, a sentence can then be automatically tagged by assigning it the tag sequence which is assigned the highest probability by the model. Performance is often enhanced with the aid of various higher level pre- and postprocessing procedures or by manually tuning the model.
A number of rule-based taggers have been built [Klein and Simmons 63, Green and Rubin 71, Hindle 89]. [Klein and Simmons 63] and [Green and Rubin 71] both have error rates substantially higher than state of the art stochastic taggers. [Hindle 89] disambiguates words within a deterministic parser. We wanted to determine whether a simple rule-based tagger without any knowledge of syntax can perform as well as a stochastic tagger, or if part of speech tagging really is a domain to which stochastic techniques are better suited.
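The stochastic approach summarized in the quote above, which assigns a sentence the tag sequence with the highest probability under a Markov model, is commonly implemented with the Viterbi algorithm. The following is a minimal sketch for a bigram hidden Markov model. The tag set, the probability tables, and the probability floor for unseen events are hypothetical values chosen for illustration; they are not estimated from any corpus and do not come from (Brill, 1992).

```python
import math


def viterbi(tokens, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for tokens under a bigram HMM."""
    if not tokens:
        return []

    def lp(p):
        # Log-probability with a small floor so unseen events do not hit log(0).
        return math.log(max(p, floor))

    # best[i][t]: log-probability of the best tag path ending in tag t at position i.
    best = [{t: lp(start_p.get(t, 0.0)) + lp(emit_p[t].get(tokens[0], 0.0))
             for t in tags}]
    back = [{}]
    for i in range(1, len(tokens)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0.0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(tokens[i], 0.0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(tokens) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Hypothetical toy parameters (not estimated from data).
TAGS = ["PRON", "VERB", "DET", "NOUN"]
START = {"PRON": 0.5, "DET": 0.4, "NOUN": 0.1}
TRANS = {
    "PRON": {"VERB": 0.9, "NOUN": 0.1},
    "VERB": {"DET": 0.6, "NOUN": 0.3, "PRON": 0.1},
    "DET": {"NOUN": 1.0},
    "NOUN": {"VERB": 0.6, "NOUN": 0.2, "DET": 0.2},
}
EMIT = {
    "PRON": {"i": 1.0},
    "VERB": {"table": 0.3, "is": 0.7},
    "DET": {"this": 0.5, "the": 0.5},
    "NOUN": {"table": 0.5, "motion": 0.5},
}

print(viterbi(["i", "table", "this", "motion"], TAGS, START, TRANS, EMIT))
print(viterbi(["the", "table", "is"], TAGS, START, TRANS, EMIT))
```

In this sketch the emission table carries the lexical information and the transition table carries the contextual information. In a real tagger both would be estimated from tagged or untagged text, as the quote describes.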