2011 LearningSubWordUnitsforOpenVoca
- (Parada et al., 2011) ⇒ Carolina Parada, Mark Dredze, Abhinav Sethy, and Ariya Rastrow. (2011). “Learning Sub-Word Units for Open Vocabulary Speech Recognition.” In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.
Subject Headings: Out-Of-Vocabulary (OOV) Word Detection Task; Open Vocabulary Speech Recognition; LVCSR System; Subword Unit; OOV Word.
Notes
Cited By
- Google Scholar ~ 32 Citations.
Quotes
Abstract
Large vocabulary speech recognition systems fail to recognize words beyond their vocabulary, many of which are information rich terms, like named entities or foreign words. Hybrid word/sub-word systems solve this problem by adding sub-word units to large vocabulary word based systems; new words can then be represented by combinations of sub-word units. Previous work heuristically created the sub-word lexicon from phonetic representations of text using simple statistics to select common phone sequences. We propose a probabilistic model to learn the subword lexicon optimized for a given task. We consider the task of out of vocabulary (OOV) word detection, which relies on output from a hybrid model. A hybrid model with our learned sub-word lexicon reduces error by 6.3% and 7.6% (absolute) at a 5% false alarm rate on an English Broadcast News and MIT Lectures task respectively.
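The abstract reports OOV detection results as an error reduction at a fixed 5% false alarm rate. As a rough illustration of reading a detector at such an operating point, the sketch below computes the miss rate at the loosest score threshold whose false alarm rate stays within a target. The function name `miss_rate_at_false_alarm`, the per-item scores, and the tie handling are assumptions made for this example; the abstract does not describe the paper's actual scoring protocol.

```python
def miss_rate_at_false_alarm(scores, labels, target_fa=0.05):
    """Miss rate at the loosest threshold whose false alarm rate <= target_fa.

    scores: detector confidences, higher means "more likely OOV" (assumed convention)
    labels: True where the item really is OOV, False where it is in-vocabulary
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one OOV and one in-vocabulary item")

    # Sweep thresholds from the highest score downwards.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    detected = 0        # OOV items flagged so far (hits)
    false_alarms = 0    # in-vocabulary items flagged so far
    best_miss = 1.0     # flagging nothing misses every OOV item

    i = 0
    while i < len(ranked):
        # Items with equal scores share a threshold, so consume them together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                detected += 1
            else:
                false_alarms += 1
            j += 1
        if false_alarms / n_neg > target_fa:
            break  # this threshold exceeds the false alarm budget
        best_miss = 1.0 - detected / n_pos
        i = j
    return best_miss


# Toy data (invented for illustration): 4 OOV items among 10 in-vocabulary items.
toy_scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65,
              0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
toy_labels = [True, True, False, True, False, False, True,
              False, False, False, False, False, False, False]
toy_miss = miss_rate_at_false_alarm(toy_scores, toy_labels, target_fa=0.25)
```

An absolute reduction such as those quoted in the abstract would then correspond to subtracting the miss rate of one system from that of another at the same false alarm rate.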
References
BibTeX
@inproceedings{2011_LearningSubWordUnitsforOpenVoca,
  author    = {Carolina Parada and Mark Dredze and Abhinav Sethy and Ariya Rastrow},
  editor    = {Dekang Lin and Yuji Matsumoto and Rada Mihalcea},
  title     = {Learning Sub-Word Units for Open Vocabulary Speech Recognition},
  booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
  pages     = {712--721},
  publisher = {The Association for Computer Linguistics},
  year      = {2011},
  url       = {https://www.aclweb.org/anthology/P11-1072/},
}
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2011 LearningSubWordUnitsforOpenVoca | Mark Dredze, Carolina Parada, Abhinav Sethy, Ariya Rastrow | | | Learning Sub-Word Units for Open Vocabulary Speech Recognition | | | | | | 2011 |