2009 SecondGenerationAMIRAToolsforAr
- (Diab, 2009) ⇒ Mona T. Diab. (2009). “Second Generation AMIRA Tools for Arabic Processing: Fast and Robust Tokenization, POS Tagging, and Base Phrase Chunking.” In: Proceedings of the 2nd International Conference on Arabic Language Resources and Tools.
Subject Headings: Phrase Chunking System.
Notes
Cited By
Quotes
Abstract
In this paper, we address the problem of processing Modern Standard Arabic. We present the second generation of tools that process Arabic (AMIRA). AMIRA is a successor suite to the ASVMTools. The AMIRA toolkit includes a clitic tokenizer (TOK), part of speech tagger (POS) and base phrase chunker (BPC) - shallow syntactic parser. The technology of AMIRA is based on supervised learning with no explicit dependence on explicit modeling or knowledge of deep morphology. AMIRA is based on using a unified framework casting each of the component problems as a classification task. The underlying technology employs Support Vector Machines in a sequence modeling framework using the YAMCHA toolkit. The system is very fast and robust and allows for a number of variable user settings depending on the disambiguation granularity. The AMIRA toolkit has been widely used for different NLP (MT, IE, IR, NER, etc.) applications due to its speed and high performance.
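The abstract's idea of casting a component problem as a classification task can be illustrated with a minimal sketch of base phrase chunking framed as per-token classification. This is not AMIRA's actual feature set or code: the function names, the window width, the English placeholder tokens, and the tag labels are all assumptions made for illustration. The sketch only builds the training examples (a feature dictionary paired with an IOB chunk tag) that a sequence classifier such as an SVM could consume.

```python
def chunks_to_iob(chunks):
    """Flatten (label, tokens) chunks into per-token IOB tags.

    A label of None marks tokens that lie outside any chunk.
    """
    tags = []
    for label, tokens in chunks:
        for i, _ in enumerate(tokens):
            if label is None:
                tags.append("O")
            else:
                tags.append(("B-" if i == 0 else "I-") + label)
    return tags


def window_features(tokens, pos_tags, chunk_tags, i, width=2):
    """Build features for token i (hypothetical feature layout).

    Uses the words and POS tags in a +/-width window, plus the chunk
    tags of the preceding tokens (gold tags here, as at training time).
    """
    feats = {}
    for off in range(-width, width + 1):
        j = i + off
        inside = 0 <= j < len(tokens)
        feats[f"w[{off}]"] = tokens[j] if inside else "<PAD>"
        feats[f"p[{off}]"] = pos_tags[j] if inside else "<PAD>"
    for off in range(1, width + 1):
        j = i - off
        feats[f"c[-{off}]"] = chunk_tags[j] if j >= 0 else "<PAD>"
    return feats


# Illustrative placeholder sentence, not data from the paper.
chunks = [("NP", ["the", "system"]), ("VP", ["is"]), ("ADJP", ["fast"]), (None, ["."])]
tokens = [tok for _, toks in chunks for tok in toks]
pos_tags = ["DT", "NN", "VBZ", "JJ", "PUNC"]

iob = chunks_to_iob(chunks)
examples = [
    (window_features(tokens, pos_tags, iob, i), iob[i])
    for i in range(len(tokens))
]
```

Each entry in `examples` is one classification instance, so the same framing would apply to tokenization or POS tagging by swapping the target tag set and the features.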
References
| | Author | title | year |
|---|---|---|---|
| 2009 SecondGenerationAMIRAToolsforAr | Mona T. Diab | Second Generation AMIRA Tools for Arabic Processing: Fast and Robust Tokenization, POS Tagging, and Base Phrase Chunking | 2009 |