Tree Adjoining Grammar
(Redirected from tree-adjoining grammar)
Jump to navigation
Jump to search
A Tree Adjoining Grammar is a mildly context-sensitive grammar based on a tree rewriting system (where the constituents are not just strings but elementary trees).
- AKA: TAG.
- Context:
- It can be defined as G = (N, T, I, A, S) where:
- [math]\displaystyle{ N }[/math] is a Finite Set of Non-Terminal Symbols.
- [math]\displaystyle{ T }[/math] is a Finite Set of Terminal Symbols.
- $I$ is a Finite Set of initial or non-recursive trees built from N, T and domination predicates.
- [math]\displaystyle{ A }[/math] is a Finite Set of finite auxiliary/recursive trees: one leaf node is a Non-Terminal with same label as the Root Node.
- [math]\displaystyle{ S }[/math] is set of start trees (has to be initial). It is the Sentence Symbol.
- $I$ and [math]\displaystyle{ A }[/math] together are called Elementary Trees.
- It can be a type of Unification Grammar.
- It can (typically) be associated to the early proposal in (Joshi et al., 1975).
- It can range from being a Non-Lexicalized TAG to being a Lexicalized TAG.
- A non-lexicalized TAG can always be converted to a lexicalized TAG (Joshi & Schabes, 1997).
- …
- It can be defined as G = (N, T, I, A, S) where:
- Counter-Example(s):
- See: Natural Language Syntactic Theory, Natural Language Syntactic Parsing, Center Embedding, Grammar Formalism.
References
2016
- (Wikipedia, 2016) ⇒ http://wikipedia.org/wiki/Tree-adjoining_grammar Retrieved:2016-4-9.
- Tree-adjoining grammar (TAG) is a grammar formalism defined by Aravind Joshi. Tree-adjoining grammars are somewhat similar to context-free grammars, but the elementary unit of rewriting is the tree rather than the symbol. Whereas context-free grammars have rules for rewriting symbols as strings of other symbols, tree-adjoining grammars have rules for rewriting the nodes of trees as other trees (see tree (graph theory) and tree (data structure)).
2009
- http://www.let.rug.nl/~vannoord/papers/diss/diss/node59.html
- The TAG formalism can be motivated because it provides a larger domain of locality in which to state linguistic dependencies. In most formalisms, dependencies can be defined between the elements of a rule (i.e. between the nodes of a local tree). In TAG, it is possible to state dependencies between nodes of trees which are further apart, because the basic building blocks of the formalism are trees. For example, the relation between a topicalized constituent and its governor can be stated locally in a TAG, whereas in most other formalisms this relation must be stated with some global mechanism. Furthermore, the adjunction operation provides for the analysis of (at least some) kinds of discontinuous constituency constructions. This is reflected in the formal power of the formalism: some TAGs recognize context-sensitive languages.”
- A Tree Adjoining Grammar consists of a set of elementary trees, divided in initial and auxiliary trees. These trees constitute the basic building blocks of the formalism. Operations of adjunction and substitution are defined which build derived trees from elementary trees. TAG thus constitute a tree-generating system, rather than a string-generating system as context-free grammar. The string language of a TAG is defined as the yields of all the trees derived by a TAG.
- http://www.cis.upenn.edu/~xtag/home.html
- “XTAG is an on-going project to develop a wide-coverage grammar for English using a lexicalized Tree Adjoining Grammar (TAG) formalism. XTAG also serves as an system for the development of TAGs and consists of a parser, an X-windows grammar development interface and a morphological analyzer."
- (Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Tree-adjoining_grammar
- QUOTE: ... One of the most basic properties of Tree Adjoining Grammars (TAGS) is that they have an extended domain of locality (EDOL) (Joshi, 1994). This refers to the fact that the elementary trees that make up the grammar are larger than the corresponding units (the productions) that are used in phrase-structure rule-based frameworks. The claim is that in Lexicalized TAGS (LTAGS) the elementary trees provide a domain of locality large enough to state co-occurrence relationships between a lexical item (the anchor of the elementary tree) and the nodes it imposes constraints on. We will call this the extended domain of locality hypothesis. For example, wh-movement can be expressed locally in a tree that will be anchored by a verb of which an argument is extracted. Consequently, features which are shared by the extraction siteand the wh-word, such as case, do not need to be percolated, but are directly identified in the tree. Figure 1 shows a tree in which the case feature at the extraction site and the wh-word share the same value. (Carroll et al., 1999)
- Misc Anoop Sarkar
- QUOTE: A Supertagger (Srinivas and Joshi, 1999) is a statistical model that can be used to probabilistically assign to the word sequence (which makes up the input string) a sequence of elementary trees from some LTAG grammar. An LTAG parser on the other hand also has to attach the various possible elementary trees for each word in the input string to provide a complete derivation tree for the input (please see (Joshi and Schabes, 1992) for more information on the LTAG formalism and LTAG parsing).
2007
- (Kakkonen, 2007) ⇒ Tuomo Kakkonen. (2007). “Framework and Resources for Natural Language Evaluation." Academic Dissertation. University of Joensuu.
- Tree-adjoining Grammar (TAG), which was first introduced by Joshi et al. (1975), manipulates trees as elementary objects. Different versions of TAGs have been implemented in English (XTAG Research Group 1998), Chinese (Bikel & Chiang 2000), French (Abeillé 1988), German (Neumann 2003), Arabic (Habash & Rambow 2004), Korean (Han et al. 2000) and Hindi (CDAG 2006).
- (Joshi, 2007)
- TAGs (more precisely, languages of TAGs) belong to the class of languages called Mildly Context-Sensitive Languages (MCSL) characterized by:
- 1) polynomial parsing complexity
- 2) Grammars for the languages in this class can characterize a limited set of patterns of nested and crossed dependencies and their combinations.
- 3) languages in this class have the constant growth property (i.e., sentences, if arranged in increasing order of length, grow only by a bounded amount).
- 4) this class properly includes CFLs
- TAGs (more precisely, languages of TAGs) belong to the class of languages called Mildly Context-Sensitive Languages (MCSL) characterized by:
- (Joshi, 2007) ⇒ Aravind K. Joshi. (2007). “Complexity of Dependencies in Natural Language. Presentation, Vancouver, Feb 22. (presentation.ppt)
2005
- (Dickinson, 2005) ⇒ M. Dickinson. (2005). “Tree Adjoining Grammars (TAG). Course Lecture. Linguistics 564 - Computational Grammar Formalisms.
1997
- (Joshi & Schabes, 1997)
- “explain how every CFG can be strongly lexicalized by TAG
- “show that Tree-Adjoining Languages are closed under lexicalization: every TAL has a lexicalized TAG grammar
- Lexicalization has some useful effects:
- finite ambiguity: corresponds to our intuition about NL ambiguities,
- statistical dependencies between words can be captured which can improve parsing accuracy
1975
- (Joshi et al., 1975) ⇒ Aravind K. Joshi, L. S. Levy, and M. Takahashi. (1975). “Tree adjunct grammars.” In: Journal Computer Systems Science, 10(1).