2014 LearningCharacterLevelRepresent
- (Santos & Zadrozny, 2014) ⇒ Cicero Nogueira dos Santos and Bianca Zadrozny. (2014). “Learning Character-level Representations for Part-of-Speech Tagging.” In: Proceedings of the 31st International Conference on Machine Learning (ICML 2014).
Subject Headings: Distributed Word Representation System; Character Embedding System; Santos-Zadrozny Character Embedding System
Notes
Cited By
Quotes
Abstract
Distributed word representations have recently been proven to be an invaluable resource for NLP. These representations are normally learned using neural networks and capture syntactic and semantic information about words. Information about word morphology and shape is normally ignored when learning word representations. However, for tasks like part-of-speech tagging, intra-word information is extremely useful, specially when dealing with morphologically rich languages. In this paper, we propose a deep neural network that learns character-level representation of words and associate them with usual word representations to perform POS tagging. Using the proposed approach, while avoiding the use of any handcrafted feature, we produce state-of-the-art POS taggers for two languages: English, with 97.32% accuracy on the Penn Treebank WSJ corpus; and Portuguese, with 97.47% accuracy on the Mac-Morpho corpus, where the latter represents an error reduction of 12.2% on the best previous known result.
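The abstract does not spell out the architecture. As a rough illustration of joining a character-level representation with a usual word representation, the sketch below assumes a convolution over character windows followed by max pooling. The weights are random and untrained, and every size, name and vocabulary entry is hypothetical rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration (not the paper's settings).
CHAR_DIM, CHAR_WINDOW, CHAR_HIDDEN, WORD_DIM = 10, 3, 20, 50

# Character table: id 0 is reserved for padding and unknown characters.
chars = "abcdefghijklmnopqrstuvwxyz"
char_to_id = {c: i + 1 for i, c in enumerate(chars)}
char_emb = rng.normal(scale=0.1, size=(len(chars) + 1, CHAR_DIM))

# Weights of the assumed convolution over windows of characters.
conv_w = rng.normal(scale=0.1, size=(CHAR_WINDOW * CHAR_DIM, CHAR_HIDDEN))
conv_b = np.zeros(CHAR_HIDDEN)

# Toy word vocabulary and word-level embedding table.
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sleeps": 3}
word_emb = rng.normal(scale=0.1, size=(len(vocab), WORD_DIM))


def char_level_vector(word):
    """Return a fixed-size vector computed from the characters of `word`."""
    if not word:
        raise ValueError("word must be non-empty")
    pad = CHAR_WINDOW // 2
    ids = [0] * pad + [char_to_id.get(c, 0) for c in word.lower()] + [0] * pad
    embedded = char_emb[ids]  # shape: (len(word) + 2 * pad, CHAR_DIM)
    # One flattened window of character embeddings per character position.
    windows = np.stack([
        embedded[i:i + CHAR_WINDOW].reshape(-1)
        for i in range(len(ids) - CHAR_WINDOW + 1)
    ])
    features = windows @ conv_w + conv_b  # shape: (len(word), CHAR_HIDDEN)
    # Max over positions gives the same output size for any word length.
    return features.max(axis=0)


def joint_representation(word):
    """Concatenate the word-level embedding with the character-level vector."""
    word_vec = word_emb[vocab.get(word.lower(), 0)]
    return np.concatenate([word_vec, char_level_vector(word)])
```

In this sketch an out-of-vocabulary word such as "sleeping" falls back to the shared `<unk>` word embedding but still receives its own character-level vector, which is the kind of intra-word information the abstract argues is useful for POS tagging.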
References
BibTeX
@inproceedings{2014_LearningCharacterLevelRepresent,
  author    = {Cicero Nogueira dos Santos and Bianca Zadrozny},
  title     = {Learning Character-level Representations for Part-of-Speech Tagging},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning (ICML 2014)},
  series    = {JMLR Workshop and Conference Proceedings},
  volume    = {32},
  pages     = {1818--1826},
  publisher = {JMLR.org},
  year      = {2014},
  url       = {http://proceedings.mlr.press/v32/santos14.html},
}
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2014 LearningCharacterLevelRepresent | Cicero Nogueira dos Santos, Bianca Zadrozny | | | Learning Character-level Representations for Part-of-Speech Tagging | | | | | | 2014 |