DATR Lexical Knowledge-Representation Language
A DATR Lexical Knowledge-Representation Language is a domain-specific knowledge representation (KR) language that represents lexical knowledge as a labeled directed graph (an inheritance network) in which each labeled node represents a word or word form and carries a set of attributes expressed as path/value equations.
- …
- Counter-Example(s):
- See: HPSG, Nonmonotonic Inheritance Network, Zdatr, Lexicon.
References
- http://coral.lili.uni-bielefeld.de/DATR/
- Zdatr is a standard DATR implementation in ANSI C based on Gazdar & Evans' DATR RFC 2.0.
2013
- (Wikipedia, 2013) ⇒ http://en.wikipedia.org/wiki/DATR Retrieved:2013-12-8.
- DATR is a language for lexical knowledge representation.[1] The lexical knowledge is encoded in a network of nodes. Each node has a set of attributes encoded with it. A node can represent a word or a word form.
DATR was developed in the late 1980s by Roger Evans and Gerald Gazdar, and used extensively in the 1990s; the standard specification is contained in the Evans and Gazdar RFC, available on the Sussex website (below). DATR has been implemented in a variety of programming languages, and several implementations are available on the internet, including an RFC compliant implementation at the Bielefeld website (below).
DATR is still used for encoding inheritance networks in various linguistic and non-linguistic domains and is under discussion as a standard notation for the representation of lexical information.
- [1] Vincent B. Y. Ooi. (1998). Computer Corpus Lexicography. Edinburgh University Press. pp. 97–100. ISBN 978-0-7486-0815-7. http://books.google.com/books?id=C9Wu7Zz8Ec4C&pg=PA97. Retrieved 20 February 2013.
1996
- (Evans & Gazdar, 1996) ⇒ Roger Evans, and Gerald Gazdar. (1996). “DATR: A Language for Lexical Knowledge Representation.” In: Computational Linguistics Journal, 22(2).
- QUOTE: Much recent research on the design of natural language lexicons has made use of nonmonotonic inheritance networks as originally developed for general knowledge representation purposes in Artificial Intelligence. DATR is a simple, spartan language for defining nonmonotonic inheritance networks with path/value equations, one that has been designed specifically for lexical knowledge representation. In keeping with its intendedly minimalist character, it lacks many of the constructs embodied either in general-purpose knowledge representation languages or in contemporary grammar formalisms. The present paper shows that the language is nonetheless sufficiently expressive to represent concisely the structure of lexical information at a variety of levels of linguistic analysis. The paper provides an informal example-based introduction to DATR and to techniques for its use, including finite-state transduction, the encoding of DAGs and lexical rules, and the representation of ambiguity and alternation. Sample analyses of phenomena such as inflectional syncretism and verbal subcategorization are given that show how the language can be used to squeeze out redundancy from lexical descriptions.
… Our title for this paper is to be taken literally -- DATR is a language for lexical knowledge representation. It is a kind of programming language, not a theoretical framework for the lexicon (in the way that, say, HPSG is a theoretical framework for syntax). Clearly, the language is well suited to lexical frameworks that embrace, or are consistent with, nonmonotonicity and inheritance of properties through networks of nodes. But those two dispositions hardly constitute a restrictive notion of suitability in the context of contemporary NLP work, nor are they absolute requirements: it is, for example, entirely possible to write useful DATR fragments that never override inherited values (and so are monotonic) or that define isolated nodes with no inheritance. … There is, for example, no built-in assumption that lexicons should be lexeme-based rather than, say, word- or morpheme-based. Unlike some other NLP inheritance languages, DATR is not intended to provide the facilities of a particular syntactic formalism. Rather, it is intended to be a lexical formalism that can be used with any syntactic representation that can be encoded in terms of attributes and values.
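The idea of a nonmonotonic inheritance network built from path/value equations, as described in the quote above, can be illustrated with a minimal Python sketch. This is not DATR syntax and not the semantics of the Evans and Gazdar RFC. It assumes a heavily simplified subset: a longest-prefix default rule, inheritance from a named node, and re-evaluation of a path at the node where the query started. The names `Local`, `Global`, `query`, and `THEORY`, and the toy lexicon itself, are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union

Path = Tuple[str, ...]


@dataclass(frozen=True)
class Local:
    """Hypothetical descriptor: inherit the value from another node."""
    node: str
    path: Optional[Path] = None  # None means "reuse the matched path"


@dataclass(frozen=True)
class Global:
    """Hypothetical descriptor: evaluate a path at the node the query started from."""
    path: Path


Item = Union[str, Local, Global]
Theory = Dict[str, Dict[Path, List[Item]]]


def query(theory: Theory, node: str, path: Path, start: Optional[str] = None) -> List[str]:
    """Evaluate node:<path> under the simplified rules assumed above."""
    start = start or node
    equations = theory[node]
    # Default rule: the longest defined prefix of the queried path wins.
    for cut in range(len(path), -1, -1):
        prefix = path[:cut]
        if prefix not in equations:
            continue
        rest = path[cut:]  # unmatched suffix, carried along when inheriting
        out: List[str] = []
        for item in equations[prefix]:
            if isinstance(item, str):
                out.append(item)  # plain atom
            elif isinstance(item, Local):
                base = item.path if item.path is not None else prefix
                out.extend(query(theory, item.node, base + rest, start))
            else:  # Global: look the path up at the original query node
                out.extend(query(theory, start, item.path + rest, start))
        return out
    raise KeyError(f"{node}:<{' '.join(path)}> is undefined")


# Toy lexicon (an assumption for illustration, not taken from the paper).
THEORY: Theory = {
    "VERB": {
        ("syn", "cat"): ["verb"],
        ("mor", "past"): [Global(("mor", "root")), "ed"],
        ("mor", "pres", "3sg"): [Global(("mor", "root")), "s"],
    },
    "Walk": {
        (): [Local("VERB")],          # inherit everything by default
        ("mor", "root"): ["walk"],
    },
    "Sing": {
        (): [Local("VERB")],
        ("mor", "root"): ["sing"],
        ("mor", "past"): ["sang"],    # overrides the inherited default
    },
}

if __name__ == "__main__":
    print("".join(query(THEORY, "Walk", ("mor", "past"))))  # walked
    print("".join(query(THEORY, "Sing", ("mor", "past"))))  # sang
```

In this sketch, `Sing` overrides the past-tense equation it would otherwise inherit from `VERB`, which is what makes the network nonmonotonic. A lexicon whose nodes never override an inherited value would behave monotonically, as the quote notes.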
1995
- (Keller, 1995) ⇒ Bill Keller. (1995). “DATR Theories and DATR Models.” In: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics (ACL-1995).
- QUOTE: DATR was introduced by Evans and Gazdar (1989a; 1989b) as a simple, declarative language for representing lexical knowledge in terms of path/value equations. The language lacks many of the constructs found in general purpose, knowledge representation formalisms, yet it has sufficient expressive power to capture concisely the structure of lexical information at a variety of levels of linguistic description. At the present time, DATR is probably the most widely-used formalism for representing natural language lexicons in the natural language processing (NLP) community. There are around a dozen different implementations of the language and large DATR lexicons have been constructed for use in a variety of applications (Cahill and Evans, 1990; Andry et al., 1992; Cahill, 1994). DATR has been applied to problems in inflectional and derivational morphology (Gazdar, 1992; Kilbury, 1992; Corbett and Fraser, 1993), lexical semantics (Kilgarriff, 1993), morphonology (Cahill, 1993), prosody (Gibbon and Bleiching, 1991) and speech (Andry et al., 1992).