Phrase Chunking Task
A Phrase Chunking Task is a text chunking task where text chunks must be syntactic phrases.
- AKA: PCT, Syntactic Segmentation, Phrase Extraction Task.
- Context:
- Input: Linguistic Expression, typically an entire Linguistic Sentence
- optional: whether Phrase Classification is required
- optional: the Phrasal Category sought, e.g. only noun phrases, Base Noun Phrases, Verb Phrases, etc.
- Output: Tagged String with Tags that demarcate the (possibly non-overlapping) Text Chunks (that correspond to Syntactic Phrases)
- optional: The Phrasal Category of each Phrase
- Performance Measure: chunk-level Precision, Recall, and F1 Measure, as computed by the CoNLL-2000 Evaluation Script
- ...
- It can (typically) identify Syntactic Boundaries
- It can (typically) mark Phrase Segments
- It can (typically) preserve Syntactic Structures
- It can (often) classify Phrase Types
- It can (often) handle Discontinuous Phrases
- ...
- It can range from being a Simple Chunking Task to being a Complex Chunking Task, depending on its chunking complexity level
- It can range from being a Single-Category Task to being a Multi-Category Task, depending on its phrase category scope
- It can range from being a Base Phrase Task to being a Full Phrase Task, depending on its phrase structure depth
- It can range from being a Domain-Specific Task to being a General-Domain Task, depending on its application scope
- ...
- It can be solved by a Phrase Chunking System by means of a Phrase Chunking Algorithm
- It can support a Parsing Task
- It can maintain Analysis History (for performance tracking)
- It can produce Chunking Results (for evaluation)
- It is an easier task than the Parsing Task
- ...
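As an illustration of how a Phrase Chunking Algorithm might solve a single-category, base-phrase version of the task, the following is a minimal rule-based sketch. It assumes the input has already been tagged with Penn Treebank-style part-of-speech tags; the function name `chunk_base_nps` and the tag sets are hypothetical choices made for this example, not part of any cited system.

```python
# Assumed (illustrative) Penn Treebank-style tag sets for base noun phrases.
DETERMINERS = {"DT", "PRP$"}
MODIFIERS = {"JJ", "JJR", "JJS", "CD"}
NOUNS = {"NN", "NNS", "NNP", "NNPS", "PRP"}


def chunk_base_nps(tagged_tokens):
    """Tag each (word, pos) pair with "B-NP", "I-NP", or "O".

    Toy rule: a base NP is an optional determiner, any number of
    modifiers, and one or more nouns. Determiners or modifiers that are
    not followed by a noun stay outside any chunk.
    """
    pos = [tag for _, tag in tagged_tokens]
    n = len(pos)
    tags = ["O"] * n
    i = 0
    while i < n:
        j = i
        if j < n and pos[j] in DETERMINERS:
            j += 1
        while j < n and pos[j] in MODIFIERS:
            j += 1
        noun_start = j
        while j < n and pos[j] in NOUNS:
            j += 1
        if j > noun_start:
            # At least one noun was found, so tokens i..j-1 form a chunk.
            tags[i] = "B-NP"
            for k in range(i + 1, j):
                tags[k] = "I-NP"
            i = j
        else:
            i += 1
    return tags


sentence = [("He", "PRP"), ("reckons", "VBZ"), ("the", "DT"),
            ("current", "JJ"), ("account", "NN"), ("deficit", "NN")]
np_tags = chunk_base_nps(sentence)
```

A hand-written rule like this only covers simple noun phrases; the systems in the references below learn chunk boundaries from annotated data instead.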
- Examples:
- General Chunking Tasks, such as:
- (PCT)("He reckons the current account deficit will narrow to only $ 1.8 billion in September.")
⇒ [NP He], [VP reckons], [NP the current account deficit], [VP will narrow], [PP to], [NP only $ 1.8 billion], [PP in], [NP September].
- Specialized Chunking Tasks, such as:
- Base NP Chunking Tasks for noun phrase extraction
- BIO Chunking Tasks for CoNLL-2000 Shared Task
- Verb Phrase Chunking Tasks for predicate identification
- ...
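The bracketed output of the general example and the tag-per-token output of a BIO Chunking Task carry the same information. The sketch below shows one way to convert between them, assuming chunks are written as `[CATEGORY word word ...]` and separated by whitespace; the function name `brackets_to_bio` is a hypothetical helper introduced only for illustration.

```python
import re

# Matches one bracketed chunk such as "[NP the current account deficit]".
CHUNK_PATTERN = re.compile(r"\[(\w+) ([^\]]+)\]")


def brackets_to_bio(chunked_text):
    """Convert bracketed chunks into a list of (token, BIO-tag) pairs.

    The first token of a chunk gets "B-<category>", later tokens get
    "I-<category>", and tokens outside every bracket get "O".
    """
    pairs = []
    position = 0
    for match in CHUNK_PATTERN.finditer(chunked_text):
        # Text between two chunks (e.g. punctuation) is outside any chunk.
        for token in chunked_text[position:match.start()].split():
            pairs.append((token, "O"))
        category = match.group(1)
        for index, token in enumerate(match.group(2).split()):
            prefix = "B" if index == 0 else "I"
            pairs.append((token, f"{prefix}-{category}"))
        position = match.end()
    for token in chunked_text[position:].split():
        pairs.append((token, "O"))
    return pairs


example = ("[NP He] [VP reckons] [NP the current account deficit] "
           "[VP will narrow] [PP to] [NP only $ 1.8 billion] "
           "[PP in] [NP September] .")
bio_pairs = brackets_to_bio(example)
```

Here `bio_pairs` starts with `("He", "B-NP")`, `("reckons", "B-VP")`, `("the", "B-NP")`, `("current", "I-NP")`, and ends with the sentence-final period tagged `"O"`.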
- Counter-Example(s):
- Text Chunking Task where the Chunks are not Syntactic Phrases, as in (Abney, 1989)
- (GCT)("I begin with an intuition: when I read a sentence, I read it a chunk at a time") ⇒ ([I begin] [with an intuition]: [when I read] [a sentence], [I read it] [a chunk] [at a time])
- Word Mention Segmentation Task
- Prosodic Chunking Task
- See: NLP Task, Linguistic Expression, Text Analysis Task, Parsing Task.
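Chunking output is usually scored at the chunk level: a predicted chunk counts as correct only if both its boundaries and its phrasal category match a gold chunk. The following is a simplified sketch of that computation over BIO tag sequences. It is not a reimplementation of the CoNLL-2000 Evaluation Script, and the names `bio_to_spans` and `chunk_scores`, as well as the lenient handling of a stray `I-` tag, are assumptions made for this example.

```python
def bio_to_spans(tags):
    """Return the set of (start, end, category) chunks encoded by BIO tags.

    `end` is exclusive. An "I-X" tag that does not continue a chunk of
    category X is treated as starting a new chunk (an assumed convention).
    """
    spans = set()
    start, category = None, None
    # A sentinel "O" guarantees that the final chunk is closed.
    for index, tag in enumerate(list(tags) + ["O"]):
        prefix, _, label = tag.partition("-")
        continues = prefix == "I" and label == category
        if not continues:
            if category is not None:
                spans.add((start, index, category))
            if prefix in ("B", "I"):
                start, category = index, label
            else:
                start, category = None, None
    return spans


def chunk_scores(gold_tags, predicted_tags):
    """Compute chunk-level (precision, recall, F1) for one tag sequence."""
    gold = bio_to_spans(gold_tags)
    predicted = bio_to_spans(predicted_tags)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# "He reckons the current account deficit": the prediction wrongly
# splits the second noun phrase into two chunks.
gold = ["B-NP", "B-VP", "B-NP", "I-NP", "I-NP", "I-NP"]
predicted = ["B-NP", "B-VP", "B-NP", "I-NP", "B-NP", "I-NP"]
precision, recall, f1 = chunk_scores(gold, predicted)
```

In this example two of the four predicted chunks match the three gold chunks, so precision is 0.5 and recall is 2/3; a partially correct chunk earns no credit.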
References
2002
- (Zhang, Damerau & Johnson, 2002) ⇒ Tong Zhang, Fred Damerau, and David Johnson. (2002). “Text Chunking based on a Generalization of Winnow.” In: The Journal of Machine Learning Research.
2000
- (Tjong Kim Sang & Buchholz, 2000) ⇒ Erik Tjong Kim Sang, and Sabine Buchholz. (2000). “Introduction to the CoNLL-2000 Shared Task: Chunking.” In: Proceedings of CoNLL-2000.
1995
- (Ramshaw & Marcus, 1995) ⇒ Lance A. Ramshaw, and Mitch P. Marcus. (1995). “Text Chunking Using Transformation-based Learning.” In: Proceedings of the Third ACL Workshop on Very Large Corpora (WVLC 1995).
1989
- (Abney, 1989) ⇒ Steven P. Abney. (1989). “Parsing By Chunks.” In: The MIT Parsing Volume, 1988-89. Center for Cognitive Science, MIT.
- QUOTE: I begin with an intuition: when I read a sentence, I read it a chunk at a time. For example, the previous sentence breaks up something like this:
(1) [I begin] [with an intuition]: [when I read] [a sentence], [I read it] [a chunk] [at a time]
These chunks correspond in some way to prosodic patterns. It appears, for instance, that the strongest stresses in the sentence fall one to a chunk, and pauses are most likely to fall between chunks. … The work I would like to describe is an attempt to give content to these intuitions, and to show that parsing by chunks has distinct processing advantages, advantages that help explain why the human parser might adopt a chunk-by-chunk strategy. … A typical natural language parser processes text in two stages. A tokenizer/morphological analyzer converts a stream of characters into a stream of words, and the parser proper converts a stream of words into a parsed sentence, or a stream of parsed sentences. In a chunking parser, the syntactic analyzer is decomposed into two separate stages, which I call the chunker and the attacher. The chunker converts a stream of words into a stream of chunks, and the attacher converts the stream of chunks into a stream of sentences.