WMT-14 Statistical Machine Translation Shared Task
A WMT-14 Statistical Machine Translation Shared Task is a WMT-14 Shared Task that tests the performance of participating SMT systems by evaluating their translation quality between English and other natural languages.
- AKA: WMT-14 Translation Task, WMT-14 News Translation Task.
- Context:
- Task Website: http://www.statmt.org/wmt14/translation-task.html
- Task Input: Text Item in a source language.
- Task Output: Machine-translated text in the target language.
- Task Requirements:
- It is a Machine Translation Benchmark Task that is part of the WMT-14 Workshop.
- Example(s):
- Counter-Example(s):
- See: Statistical Machine Translation System, Neural Machine Translation System, Natural Language Processing System, Natural Language Understanding System, NLP Benchmark Task, OOV Word Translation System.
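Several of the submissions quoted in the references below report their results as BLEU scores. As a rough illustration of how such a score is computed, here is a minimal sketch of corpus-level BLEU in Python. It is not the scoring script used by the task: it assumes whitespace tokenization, exactly one reference per hypothesis, and no smoothing, and the names `ngrams` and `corpus_bleu` are hypothetical helpers introduced only for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale.

    Assumptions of this sketch: one reference per hypothesis,
    whitespace tokenization, and no smoothing.
    """
    matched = [0] * max_n  # clipped n-gram matches, per n-gram order
    total = [0] * max_n    # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            # Counter intersection clips each hypothesis n-gram count
            # by the count of the same n-gram in the reference.
            matched[n - 1] += sum((hyp_counts & ref_counts).values())
            total[n - 1] += sum(hyp_counts.values())

    # Without smoothing, the score is zero if any order has no match.
    if hyp_len == 0 or min(matched) == 0:
        return 0.0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # The brevity penalty lowers the score of output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)
```

A call such as `corpus_bleu(system_outputs, reference_translations)` takes two equal-length lists of sentences. Scores from a sketch like this are not directly comparable with published numbers, since tokenization and casing choices change the result.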
References
2017a
- (Artetxe et al., 2017) ⇒ Mikel Artetxe, Gorka Labaka, Eneko Agirre, and Kyunghyun Cho. (2017). “Unsupervised Neural Machine Translation.” In: ePrint arXiv:1710.11041.
2017b
- (Manning & Socher, 2017) ⇒ Christopher Manning, and Richard Socher. (2017). “Lecture 10 - Neural Machine Translation and Models with Attention.” In: Lecture in Natural Language Processing with Deep Learning - Stanford CS224N Ling284 (2017).
2016
- (Luong et al., 2016) ⇒ Minh-Thang Luong, Quoc V. Le, Ilya Sutskever, Oriol Vinyals, and Lukasz Kaiser. (2016). “Multi-task Sequence to Sequence Learning.” In: Proceedings of 4th International Conference on Learning Representations (ICLR-2016).
2015
- (Luong et al., 2015) ⇒ Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, and Wojciech Zaremba. (2015). “Addressing the Rare Word Problem in Neural Machine Translation.” In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL 2015) Volume 1: Long Papers.
2014a
- (Bojar et al., 2014) ⇒ Ondrej Bojar, Christian Buck, Christian Federmann, Barry Haddow, Philipp Koehn, Johannes Leveling, Christof Monz, Pavel Pecina, Matt Post, Herve Saint-Amand, Radu Soricut, Lucia Specia, and Ales Tamchyna. (2014). “Findings of the 2014 Workshop on Statistical Machine Translation.” In: Proceedings of the Ninth Workshop on Statistical Machine Translation (WMT@ACL 2014).
- QUOTE: The recurring task of the workshop examines translation between English and other languages. As in the previous years, the other languages include German, French, Czech and Russian.
We dropped Spanish and added Hindi this year. From a linguistic point of view, Spanish poses similar problems as French, making its prior inclusion less valuable. Hindi is not only interesting since it is a more distant language than the European languages we include, but also because we have much less training data, thus forcing researchers to deal with low resource conditions, but also providing them with a language pair that does not suffer from the computational complexities of having to deal with massive amounts of training data.
We created a test set for each language pair by translating newspaper articles and provided training data.
2014b
- (Williams et al., 2014) ⇒ Philip Williams, Rico Sennrich, Maria Nadejde, Matthias Huck, Eva Hasler, and Philipp Koehn. (2014). “Edinburgh's Syntax-Based Systems at WMT 2014.” In: Proceedings of the Ninth Workshop on Statistical Machine Translation (WMT@ACL 2014).
- QUOTE: For this year’s WMT shared translation task we built syntax-based systems for six language pairs:
2014c
- (Bicici et al., 2014) ⇒ Ergun Bicici, Qun Liu, and Andy Way. (2014). “FDA5 for Fast Deployment of Accurate Statistical Machine Translation Systems.” In: Proceedings of the Ninth Workshop on Statistical Machine Translation (WMT@ACL 2014).
- QUOTE: We run ParFDA5 SMT experiments for all language pairs in both directions in the WMT14 translation task (Bojar et al., 2014), which include English-Czech (en-cs), English-German (en-de), English-French (en-fr), English-Hindi (en-hi), and English-Russian (en-ru).
2014d
- (Neidert et al., 2014) ⇒ Julia Neidert, Sebastian Schuster, Spence Green, Kenneth Heafield, and Christopher D. Manning. (2014). “Stanford University's Submissions to the WMT 2014 Translation Task.” In: Proceedings of the Ninth Workshop on Statistical Machine Translation (WMT@ACL 2014).
- QUOTE: We describe Stanford's participation in the French-English and English-German tracks of the 2014 Workshop on Statistical Machine Translation (WMT). Our systems used large feature sets, word classes, and an optional unconstrained language model. Among constrained systems, ours performed the best according to uncased BLEU: 36.0% for French-English (...)
2014e
- (Sutskever et al., 2014) ⇒ Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. (2014). “Sequence to Sequence Learning with Neural Networks.” In: Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems (NIPS 2014).
- QUOTE: The main result of this work is the following. On the WMT'14 English to French translation task, we obtained a BLEU score of 34.81 by directly extracting translations from an ensemble of 5 deep LSTMs (with 380M parameters each) using a simple left-to-right beam-search decoder.