Neural Machine Translation (NMT) Algorithm
A Neural Machine Translation (NMT) Algorithm is a data-driven MT algorithm that uses a neural NLP algorithm (typically an encoder-decoder neural network) to map source-language text to target-language text.
- Context:
- It can be implemented by a Deep Neural Network-based Machine Translation System (to solve a deep neural network-based machine translation task).
- It can range from being a Character-based Neural Machine Translation Algorithm to being a Word-based Neural Machine Translation Algorithm (this input-granularity distinction is illustrated in the tokenization sketch after the See links below).
- Example(s):
- an Attention-based Neural Machine Translation Algorithm, such as the one of Bahdanau et al. (2014).
- a Sequence-to-Sequence Neural Machine Translation Algorithm, such as the one of Sutskever et al. (2014).
- Counter-Example(s):
- a Statistical Machine Translation (SMT) Algorithm.
- a Rule-based Machine Translation Algorithm.
- See: Neural Parsing Algorithm, Parsey McParseface, Attention Mechanism.
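The input-granularity distinction noted in the Context section can be shown with a minimal tokenization sketch (plain Python; the example sentence and the vocabulary handling are illustrative assumptions, not taken from any referenced system):

```python
# Minimal sketch of word-level vs. character-level tokenization for NMT input.
# The example sentence and vocabulary construction are illustrative only.

sentence = "the cat sat"

# A word-based NMT algorithm consumes a sequence of word tokens.
word_tokens = sentence.split()          # ['the', 'cat', 'sat']

# A character-based NMT algorithm consumes a sequence of characters,
# which removes out-of-vocabulary words at the cost of longer sequences.
char_tokens = list(sentence)            # ['t', 'h', 'e', ' ', 'c', ...]

# Either way, tokens are mapped to integer ids before being fed to the encoder.
word_vocab = {tok: i for i, tok in enumerate(sorted(set(word_tokens)))}
char_vocab = {tok: i for i, tok in enumerate(sorted(set(char_tokens)))}

word_ids = [word_vocab[t] for t in word_tokens]
char_ids = [char_vocab[t] for t in char_tokens]

print(word_ids)
print(char_ids)
```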
References
2018a
- (Belinkov & Bisk, 2018) ⇒ Yonatan Belinkov, and Yonatan Bisk. (2018). “Synthetic and Natural Noise Both Break Neural Machine Translation.” In: Proceedings of the 6th International Conference on Learning Representations (ICLR-2018).
2018b
- (Hoang et al., 2018) ⇒ Vu Cong Duy Hoang, Philipp Koehn, Gholamreza Haffari, and Trevor Cohn. (2018). “Iterative Back-translation for Neural Machine Translation.” In: Proceedings of the 2nd Workshop on Neural Machine Translation and Generation.
- ABSTRACT: We present iterative back-translation, a method for generating increasingly better synthetic parallel data from monolingual data to train neural machine translation systems. Our proposed method is very simple yet effective and highly applicable in practice. We demonstrate improvements in neural machine translation quality in both high and low resourced scenarios, including the best reported BLEU scores for the WMT 2017 German↔English tasks.
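The abstract above describes a training loop that can be summarized in a few lines; the sketch below (Python, with the hypothetical helpers train_nmt and translate standing in for a real NMT toolkit) illustrates the iterative back-translation idea rather than the authors' exact implementation:

```python
# Illustrative sketch of iterative back-translation.
# train_nmt and translate are hypothetical stand-ins for a real NMT toolkit;
# here they are stubbed as a lookup table so the loop structure is runnable.

def train_nmt(pairs):
    # Stub: a real system would fit an encoder-decoder model on the pairs.
    return dict(pairs)

def translate(model, sentence):
    # Stub: look the sentence up, otherwise echo it unchanged.
    return model.get(sentence, sentence)

def iterative_back_translation(parallel_data, mono_src, mono_tgt, rounds=3):
    # Start from models trained on the available parallel data only.
    src2tgt = train_nmt(parallel_data)
    tgt2src = train_nmt([(t, s) for s, t in parallel_data])

    for _ in range(rounds):
        # Back-translate monolingual text to create synthetic parallel pairs.
        synth_src2tgt = [(translate(tgt2src, t), t) for t in mono_tgt]
        synth_tgt2src = [(translate(src2tgt, s), s) for s in mono_src]

        # Retrain each direction on real plus synthetic data; better models
        # should yield better synthetic data on the next round.
        src2tgt = train_nmt(parallel_data + synth_src2tgt)
        tgt2src = train_nmt([(t, s) for s, t in parallel_data] + synth_tgt2src)

    return src2tgt, tgt2src

parallel = [("ein haus", "a house"), ("ein auto", "a car")]
fwd, bwd = iterative_back_translation(parallel, ["ein haus"], ["a car"], rounds=2)
print(translate(fwd, "ein haus"))   # 'a house'
```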
2017a
- (Manning & Socher, 2017a) ⇒ Christopher Manning, and Richard Socher. (2017). “Lecture 10 - Neural Machine Translation and Models with Attention.” In: Natural Language Processing with Deep Learning - Stanford CS224N Ling284.
2017b
- (Monroe, 2017) ⇒ Don Monroe. (2017). “Deep Learning Takes on Translation.” In: Communications of the ACM, 60(6). doi:10.1145/3077229
- QUOTE: … Most implementations of translation employ two neural networks. The first, called the encoder, processes input text from one language to create an evolving fixed-length vector representation of the evolving input. A second "decoder" network monitors this vector to produce text in a different language. Typically, the encoder and decoder are trained as a pair for each choice of source and target language.
An additional critical element is the use of "attention," which Cho said was "motivated from human translation." As translation proceeds, based on what has been translated so far, this attention mechanism selects the most useful part of the text to translate next.
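The encoder/decoder-with-attention pattern described in the quote can be illustrated numerically; in the toy example below (numpy), random vectors stand in for trained network states, and a plain dot-product score replaces the learned alignment model used in practice (e.g., by Bahdanau et al., 2014):

```python
# Toy numpy sketch of attention over encoder states.
# All vectors are random placeholders standing in for trained RNN states.
import numpy as np

rng = np.random.default_rng(0)
d = 8                                    # hidden size
src_len = 5                              # number of source tokens

# Encoder: one hidden vector per source token.
encoder_states = rng.normal(size=(src_len, d))

# Decoder state at the current target position.
decoder_state = rng.normal(size=(d,))

# Attention: score each encoder state against the decoder state,
# normalize with a softmax, and form a weighted-sum context vector.
scores = encoder_states @ decoder_state
weights = np.exp(scores - scores.max())
weights /= weights.sum()
context = weights @ encoder_states       # context focused on the useful tokens

# The decoder would combine the context with its own state to predict
# the next target word; here we only show the combined feature vector.
combined = np.concatenate([decoder_state, context])
print(weights.round(3))
print(combined.shape)                    # (16,)
```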
2016
- (Sennrich et al., 2016) ⇒ Rico Sennrich, Barry Haddow, and Alexandra Birch. (2016). “Neural Machine Translation of Rare Words with Subword Units.” In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
- QUOTE: Neural machine translation (NMT) models typically operate with a fixed vocabulary, but translation is an open-vocabulary problem. Previous work addresses the translation of out-of-vocabulary words by backing off to a dictionary. In this paper, we introduce a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units. ...
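The subword-unit idea can be sketched with a toy byte-pair-style merge procedure (the tiny word list and the number of merges below are illustrative assumptions, not the authors' setup, and details such as end-of-word markers are omitted):

```python
# Toy sketch of learning byte-pair-style subword merges.
# Frequent character pairs are merged repeatedly, so frequent words collapse
# into single units while rare words remain split into smaller subword units.
from collections import Counter

corpus = ["low", "lower", "lowest", "newer", "wider"]

# Each word starts as a sequence of single characters.
words = Counter(tuple(w) for w in corpus)

def most_frequent_pair(words):
    pairs = Counter()
    for word, freq in words.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(words, pair):
    merged = Counter()
    for word, freq in words.items():
        out, i = [], 0
        while i < len(word):
            if i + 1 < len(word) and (word[i], word[i + 1]) == pair:
                out.append(word[i] + word[i + 1])
                i += 2
            else:
                out.append(word[i])
                i += 1
        merged[tuple(out)] += freq
    return merged

for _ in range(5):                      # number of merge operations (toy value)
    pair = most_frequent_pair(words)
    if pair is None:
        break
    words = merge_pair(words, pair)

print(words)
```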
2014a
- (Bahdanau et al., 2014) ⇒ Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. (2014). “Neural Machine Translation by Jointly Learning to Align and Translate.” arXiv preprint arXiv:1409.0473.
2014b
- (Sutskever et al., 2014) ⇒ Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. (2014). “Sequence to Sequence Learning with Neural Networks.” In: Advances in Neural Information Processing Systems.