Phrase-based Statistical Machine Translation System
A Phrase-based Statistical Machine Translation System is a Statistical Machine Translation System that is based on a phrase-based decoder.
- AKA: Phrase-based SMT System.
- Context:
- It can solve a Phrase-based SMT Task by implementing a Phrase-based SMT Algorithm.
- It usually requires a phrase translation table.
- Example(s):
- Counter-Example(s):
- See: MOSES, Automatic Grammatical Error Correction System, Neural Machine Translation System, Language Model Training System.
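As a rough illustration of the phrase translation table mentioned under Context, the following minimal sketch stores a toy table as a dictionary and picks the best monotone segmentation of a source sentence by dynamic programming. The table entries, the probabilities, `MAX_PHRASE_LEN`, and the function name `translate_monotone` are all hypothetical. A real phrase-based decoder also handles reordering, a language model, and pruning, none of which is modelled here.

```python
import math

# Hypothetical toy phrase table: source phrase -> [(target phrase, P(target | source))].
# The entries and probabilities are invented for illustration only.
PHRASE_TABLE = {
    ("das",): [("the", 0.7), ("this", 0.3)],
    ("haus",): [("house", 0.9), ("home", 0.1)],
    ("das", "haus"): [("the house", 0.8)],
    ("ist",): [("is", 1.0)],
    ("klein",): [("small", 0.8), ("little", 0.2)],
}
MAX_PHRASE_LEN = 3  # assumed upper bound on source phrase length


def translate_monotone(source_words):
    """Return (log_prob, translation) for the best monotone segmentation.

    Simplified sketch: no reordering, no language model, no pruning.
    """
    n = len(source_words)
    # best[i] holds the best (log_prob, target_phrases) covering source_words[:i].
    best = [None] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_PHRASE_LEN), end):
            if best[start] is None:
                continue  # no way to cover the prefix up to `start`
            src = tuple(source_words[start:end])
            for target, prob in PHRASE_TABLE.get(src, []):
                score = best[start][0] + math.log(prob)
                if best[end] is None or score > best[end][0]:
                    best[end] = (score, best[start][1] + [target])
    if best[n] is None:
        raise ValueError("no phrase segmentation covers the input")
    log_prob, phrases = best[n]
    return log_prob, " ".join(phrases)


log_prob, translation = translate_monotone("das haus ist klein".split())
print(translation, round(math.exp(log_prob), 2))
```

In this toy table the two-word phrase "das haus" (0.8) outscores translating "das" and "haus" separately (0.7 × 0.9 = 0.63), so the multi-word entry is selected.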
References
2016
- (Junczys-Dowmunt & Grundkiewicz, 2016) ⇒ Marcin Junczys-Dowmunt, and Roman Grundkiewicz. (2016). “Phrase-based Machine Translation is State-of-the-Art for Automatic Grammatical Error Correction.” In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016).
- QUOTE: We have shown that a pure SMT system actually outperforms the best reported results for any paradigm in GEC if correct parameter tuning is performed. With this tuning mechanism available, task-specific features have been explored that bring further significant improvements, putting phrase-based SMT ahead of other approaches by a large margin. None of the explored features require complicated pipelines or re-ranking mechanisms. Instead they are a natural part of the log-linear model in phrase-based SMT. It is therefore quite easy to reproduce our results and the presented systems may serve as new baselines for automatic grammatical error correction. Our systems and scripts have been made available for better reproducibility.
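The log-linear model referred to in the quote scores each hypothesis as a weighted sum of feature values, which is why extra task-specific features can be added without changing the pipeline. A minimal sketch follows. The feature names (`tm`, `lm`, `word_penalty`), the weights, and the feature values are hypothetical and are not taken from the paper.

```python
def log_linear_score(features, weights):
    """Weighted sum of feature values: sum_i lambda_i * h_i(e, f)."""
    return sum(weights[name] * value for name, value in features.items())


def best_hypothesis(hypotheses, weights):
    """Return the hypothesis with the highest log-linear score."""
    return max(hypotheses, key=lambda h: log_linear_score(h["features"], weights))


# Hypothetical weights; in practice these would come from parameter tuning.
WEIGHTS = {"tm": 1.0, "lm": 0.5, "word_penalty": -0.1}

# Hypothetical hypotheses with invented feature values
# (tm and lm are log-probabilities, word_penalty is a word count).
HYPOTHESES = [
    {"text": "He go to school", "features": {"tm": -0.2, "lm": -9.0, "word_penalty": 4}},
    {"text": "He goes to school", "features": {"tm": -0.9, "lm": -6.0, "word_penalty": 4}},
]

print(best_hypothesis(HYPOTHESES, WEIGHTS)["text"])
```

Adding a new feature under this sketch only requires a new key in each `features` dictionary and a matching entry in `WEIGHTS`.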
2007
- (Koehn et al., 2007) ⇒ Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. (2007). “Moses: Open Source Toolkit for Statistical Machine Translation.” In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions (ACL 2007).
- QUOTE: The current phrase-based approach to statistical machine translation is limited to the mapping of small text chunks without any explicit use of linguistic information, be it morphological, syntactic, or semantic. These additional sources of information have been shown to be valuable when integrated into pre-processing or post-processing steps.
Moses also integrates confusion network decoding, which allows the translation of ambiguous input. This enables, for instance, the tighter integration of speech recognition and machine translation. Instead of passing along the one-best output of the recognizer, a network of different word choices may be examined by the machine translation system.
Efficient data structures in Moses for the memory-intensive translation model and language model allow the exploitation of much larger data resources with limited hardware.
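The confusion network input described in the quote above can be pictured as a sequence of columns, each holding alternative words with recognizer probabilities. The sketch below is only an illustration under that assumption: the network contents and the name `enumerate_paths` are hypothetical, and it does not reproduce the Moses input format. A real decoder searches the network directly instead of listing every path.

```python
import itertools
import math

# Hypothetical confusion network: one column per position, each listing
# alternative words with invented recognizer posterior probabilities.
CONFUSION_NETWORK = [
    [("I", 0.9), ("eye", 0.1)],
    [("see", 0.6), ("sea", 0.4)],
    [("it", 1.0)],
]


def enumerate_paths(network):
    """Return (probability, words) for every path, most probable first."""
    paths = []
    for combo in itertools.product(*network):
        words = [word for word, _ in combo]
        prob = math.prod(p for _, p in combo)
        paths.append((prob, words))
    return sorted(paths, key=lambda path: path[0], reverse=True)


for prob, words in enumerate_paths(CONFUSION_NETWORK):
    print(round(prob, 2), " ".join(words))
```

Passing only the first path corresponds to the one-best recognizer output; keeping all columns lets the translation system weigh the alternative word choices as well.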