Linguistic Resource
A Linguistic Resource is an Artifact that describes some aspect of a Natural Language and can be used by a Natural Language Processing System.
- See: Corpora, Genomic Resource.
Resources
2010
- (Labropoulou et al., 2010) ⇒ Penny Labropoulou, Elina Desypri, Stelios Piperidis. (2010). “Report on the Scientific, Organizational and Economic Methods and Models for Building and Maintaining LRs.” FLaReNet. ECP-2007-LANG-617001
2007
- (Kakkonen, 2007) ⇒ Tuomo Kakkonen. (2007). “Framework and Resources for Natural Language Evaluation.” Academic Dissertation. University of Joensuu.
- Evaluation of the correctness of a parser’s output is generally done by comparing the system output to correct human-constructed structures. These gold standard parses are obtained from a linguistic resource. Section 6.1 analyzes existing linguistic resources and their suitability for parser evaluation. Linguistic annotation (hereafter referred to as annotation) refers to the notations applied to language data that describes its information content. The annotation in a treebank, for example, includes at least POS tags and syntactic tags. An annotation scheme refers to the specification of a set of practices used for annotation in a particular linguistic resource. An encoding scheme defines the way in which the annotated data is represented. I will both introduce the annotation and encoding schemes used in existing linguistic resources and analyze their suitability for parser
- The most commonly used linguistic resources for parser evaluation are treebanks, which are collections of syntactically annotated sentences. These syntactically annotated corpora consist of sentences which have been assigned parse trees with at least syntactic and morphosyntactic annotation. …
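The comparison of parser output against gold standard parses described above can be illustrated with a minimal sketch. It assumes a labeled-bracket comparison in the style of PARSEVAL, which the quoted passage does not itself name; the `Span` type, the `bracket_scores` function, and the example sentence are hypothetical and chosen only for illustration.

```python
from typing import Set, Tuple

# A labeled constituent: (syntactic label, start token index, end token index).
# This representation is an assumption for the sketch, not a treebank's
# actual encoding scheme.
Span = Tuple[str, int, int]


def bracket_scores(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of predicted spans against gold spans."""
    matched = len(gold & predicted)  # spans that agree on label and boundaries
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical gold standard parse of the three-token sentence "the cat sat".
gold = {("S", 0, 3), ("NP", 0, 2), ("VP", 2, 3)}

# Hypothetical parser output that gets the NP boundary wrong.
predicted = {("S", 0, 3), ("NP", 0, 1), ("VP", 2, 3)}

precision, recall, f1 = bracket_scores(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here the gold spans play the role of the treebank annotation, and the score measures how many of the parser's constituents match it exactly. Other evaluation schemes, such as dependency-based ones, would compare different units.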
2004
- (Doddington et al., 2004) ⇒ George Doddington, A. Mitchell, M. Przybocki, L. Ramshaw, S. Strassel, and R. Weischedel. (2004). “The Automatic Content Extraction (ACE) Program – Tasks, Data, and Evaluation.” In: Proceedings of Conference on Language Resources and Evaluation (LREC 2004).
- Under the ACE (NIST 2003) and DARPA TIDES (TIDES 2004) Programs, the Linguistic Data Consortium at the University of Pennsylvania develops annotation guidelines, corpora and other linguistic resources to support information extraction research (LDC 2004).
2003
- (Bernardi et al., 2003) ⇒ Raffaella Bernardi, Valentin Jijkoun, Gilad Mishne, and Maarten de Rijke. (2003). “Selectively Using Linguistic Resources Throughout the Question Answering Pipeline.” In: Proceedings of the 2nd CoLogNET-ElsNET Symposium.