2010 LTPAChineseLanguageTechnologyPl
- (Che et al., 2010) ⇒ Wanxiang Che, Zhenghua Li, and Ting Liu. (2010). “LTP: A Chinese Language Technology Platform.” In: Proceedings of the 23rd International Conference on Computational Linguistics: Demonstrations.
Subject Headings:
Notes
- Other Version(s) and Link(s):
Cited By
- Google Scholar: ~ 443 Citations, Retrieved: 2020-06-04.
- Semantic Scholar: ~ 262 Citations, Retrieved: 2020-06-04.
- MS Academic: ~ 394 Citations, Retrieved: 2020-06-04.
Quotes
Abstract
LTP (Language Technology Platform) is an integrated Chinese processing platform which includes a suite of high-performance natural language processing (NLP) modules and relevant corpora. For the syntactic and semantic parsing modules in particular, we achieved good results in relevant evaluations such as CoNLL and SemEval. Based on an XML internal data representation, users can easily use these modules and corpora by invoking DLL (Dynamic Link Library) or Web service APIs (Application Program Interfaces), and view the processing results directly with the visualization tool.
1 Introduction
A Chinese natural language processing (NLP) platform typically includes lexical analysis (word segmentation, part-of-speech tagging, named entity recognition), syntactic parsing and semantic parsing (word sense disambiguation, semantic role labeling) modules. Developing a full NLP platform is laborious and time-consuming for researchers, especially for Chinese, which has fewer existing NLP tools. Building an integrated Chinese processing platform is therefore of particular concern. Such a platform faces several key problems: providing high-performance language processing modules, integrating these modules smoothly, making the processing results convenient to use, and displaying the processing results directly. LTP (Language Technology Platform), a Chinese processing platform, is built to solve these problems. It uses XML to transfer data between modules and provides a range of high-performance Chinese processing modules, DLL and Web service APIs, visualization tools, and relevant corpora.
2 Language Technology Platform
LTP (Language Technology Platform) is an integrated Chinese processing platform. Its architecture is shown in Figure 1. From the bottom up, LTP comprises 6 components: 1) Corpora, 2) Various Chinese processing modules, 3) XML based internal data presentation and processing, 4) DLL API, 5) Web service, and 6) Visualization tool. In the following sections, we will introduce these components in detail.
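The idea of modules exchanging data through a shared XML document can be illustrated with a minimal sketch. It is not LTP's actual schema or API: the element names (`doc`, `sent`, `word`), the attribute names (`cont`, `pos`), and the functions `make_document`, `segment`, and `tag` are all hypothetical, and the whitespace splitter and dictionary tagger are toy stand-ins for real modules. The point shown is only that each module reads the annotations left by the previous one and writes its own into the same tree.

```python
import xml.etree.ElementTree as ET


def make_document(sentences):
    """Wrap raw sentences in a hypothetical XML container (not LTP's real schema)."""
    root = ET.Element("doc")
    for i, text in enumerate(sentences):
        ET.SubElement(root, "sent", id=str(i), cont=text)
    return root


def segment(root, splitter):
    """Toy segmentation module: adds one <word> child per token to each <sent>."""
    for sent in root.iter("sent"):
        for j, token in enumerate(splitter(sent.get("cont"))):
            ET.SubElement(sent, "word", id=str(j), cont=token)
    return root


def tag(root, lexicon, default="x"):
    """Toy tagging module: annotates the <word> elements written by segment()."""
    for word in root.iter("word"):
        word.set("pos", lexicon.get(word.get("cont"), default))
    return root


# Each stage consumes and returns the same tree, so later stages
# (or a visualization tool) can read every earlier annotation.
doc = make_document(["I love NLP"])
doc = segment(doc, str.split)           # assumption: whitespace stands in for a segmenter
doc = tag(doc, {"NLP": "n"})            # assumption: a tiny made-up lexicon
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Keeping all annotations in one document means a new module only needs to agree on the shared format, not on the internals of the other modules.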
2.1 Corpora
Many NLP tasks are based on annotated corpora. We distributed two key corpora used by LTP. First, WordMap is a Chinese thesaurus which contains 100,093 words. In WordMap, each word sense belongs to a five-level category hierarchy. There are 12 top-level, about 100 second-level, and 1,500 third-level categories, and more fourth- and fifth-level categories.
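A five-level category hierarchy of this kind can be sketched as a mapping from each word to one category path per sense. The entries below are made-up placeholders, not WordMap data, and the names `TOY_THESAURUS`, `shared_depth`, and `max_shared_depth` are hypothetical. The sketch shows one thing such a structure allows: counting how many levels two senses share, starting from the top.

```python
from typing import Dict, List, Tuple

# One label per level, ordered from the top level down to the fifth level.
CategoryPath = Tuple[str, str, str, str, str]

# Hypothetical toy entries: each word maps to one path per word sense.
TOY_THESAURUS: Dict[str, List[CategoryPath]] = {
    "word_a": [("T1", "S1", "C1", "D1", "E1")],
    "word_b": [("T1", "S1", "C1", "D2", "E7")],
    "word_c": [("T2", "S9", "C4", "D1", "E1"), ("T1", "S1", "C2", "D1", "E1")],
}


def shared_depth(a: CategoryPath, b: CategoryPath) -> int:
    """Count matching levels from the top, stopping at the first mismatch."""
    depth = 0
    for level_a, level_b in zip(a, b):
        if level_a != level_b:
            break
        depth += 1
    return depth


def max_shared_depth(w1: str, w2: str, thesaurus: Dict[str, List[CategoryPath]]) -> int:
    """Best shared depth over all sense pairs of two words (0 if either is unknown)."""
    senses1 = thesaurus.get(w1, [])
    senses2 = thesaurus.get(w2, [])
    return max((shared_depth(a, b) for a in senses1 for b in senses2), default=0)


print(max_shared_depth("word_a", "word_b", TOY_THESAURUS))
print(max_shared_depth("word_a", "word_c", TOY_THESAURUS))
```

Comparing paths level by level from the top matters because a lower-level label is only meaningful under its parent categories.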
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2010 LTPAChineseLanguageTechnologyPl | Ting Liu, Wanxiang Che, Zhenghua Li | | | LTP: A Chinese Language Technology Platform | | | | | | 2010 |