2016 GooglesNeuralMachineTranslation

From GM-RKB

Subject Headings: Neural MT Algorithm; Neural MT System; Memory-Augmented Neural Network.

Notes

Cited By

2017

Quotes

Abstract

Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. Unfortunately, NMT systems are known to be computationally expensive both in training and in translation inference. Also, most NMT systems have difficulty with rare words. These issues have hindered NMT's use in practical deployments and services, where both accuracy and speed are essential. In this work, we present GNMT, Google's Neural Machine Translation system, which attempts to address many of these issues. Our model consists of a deep LSTM network with 8 encoder and 8 decoder layers using attention and residual connections. To improve parallelism and therefore decrease training time, our attention mechanism connects the bottom layer of the decoder to the top layer of the encoder. To accelerate the final translation speed, we employ low-precision arithmetic during inference computations. To improve handling of rare words, we divide words into a limited set of common sub-word units ("wordpieces") for both input and output. This method provides a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models, naturally handles translation of rare words, and ultimately improves the overall accuracy of the system. Our beam search technique employs a length-normalization procedure and uses a coverage penalty, which encourages generation of an output sentence that is most likely to cover all the words in the source sentence. On the WMT'14 English-to-French and English-to-German benchmarks, GNMT achieves competitive results to state-of-the-art. Using a human side-by-side evaluation on a set of isolated simple sentences, it reduces translation errors by an average of 60% compared to Google's phrase-based production system.
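
The abstract mentions length normalization and a coverage penalty but does not give the scoring rule. The following is a minimal sketch of how the two terms can be combined when ranking finished beam hypotheses, using the form described in the body of the paper: the log-probability is divided by a length penalty, then a coverage term based on accumulated attention is added. The function names, the default values of `alpha` and `beta`, and the small epsilon clamp are illustrative assumptions, not details taken from this page.

```python
import math

def length_penalty(target_length, alpha=0.2):
    """Length normalization term: ((5 + |Y|) / (5 + 1)) ** alpha."""
    return ((5.0 + target_length) / 6.0) ** alpha

def coverage_penalty(attention, beta=0.2, eps=1e-9):
    """Penalize hypotheses whose attention leaves source words uncovered.

    attention[j][i] is the attention probability of target word j on
    source word i. The eps clamp is an assumption added here to avoid
    log(0) when a source word receives no attention at all.
    """
    num_source = len(attention[0])
    total = 0.0
    for i in range(num_source):
        covered = sum(row[i] for row in attention)
        total += math.log(max(min(covered, 1.0), eps))
    return beta * total

def rescore(log_prob, attention, alpha=0.2, beta=0.2):
    """Score = log P(Y|X) / lp(Y) + cp(X; Y)."""
    target_length = len(attention)
    return log_prob / length_penalty(target_length, alpha) + coverage_penalty(attention, beta)

if __name__ == "__main__":
    full = [[1.0, 0.0], [0.0, 1.0]]      # both source words attended
    partial = [[1.0, 0.0], [1.0, 0.0]]   # second source word ignored
    print(rescore(-2.0, full), rescore(-2.0, partial))
```

With equal log-probabilities, the hypothesis that attends to every source word scores higher than the one that ignores a word, which is the behavior the abstract attributes to the coverage penalty.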

References

BibTeX

@article{2016_GooglesNeuralMachineTranslation,
  author    = {Yonghui Wu and
               Mike Schuster and
               Zhifeng Chen and
               Quoc V. Le and
               Mohammad Norouzi and
               Wolfgang Macherey and
               Maxim Krikun and
               Yuan Cao and
               Qin Gao and
               Klaus Macherey and
               Jeff Klingner and
               Apurva Shah and
               Melvin Johnson and
               Xiaobing Liu and
               Lukasz Kaiser and
               Stephan Gouws and
               Yoshikiyo Kato and
               Taku Kudo and
               Hideto Kazawa and
               Keith Stevens and
               George Kurian and
               Nishant Patil and
               Wei Wang and
               Cliff Young and
               Jason Smith and
               Jason Riesa and
               Alex Rudnick and
               Oriol Vinyals and
               Greg Corrado and
               Macduff Hughes and
               Jeffrey Dean},
  title     = {Google's Neural Machine Translation System: Bridging the Gap between
               Human and Machine Translation},
  journal   = {CoRR},
  volume    = {abs/1609.08144},
  year      = {2016},
  url       = {http://arxiv.org/abs/1609.08144},
  archivePrefix = {arXiv},
  eprint    = {1609.08144},
}


Page: 2016 GooglesNeuralMachineTranslation
Author: Taku Kudo, Wei Wang, Jeffrey Dean, Mike Schuster, Greg Corrado, Oriol Vinyals, Quoc V. Le, Xiaobing Liu, Zhifeng Chen, Lukasz Kaiser, Yonghui Wu, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Stephan Gouws, Yoshikiyo Kato, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Macduff Hughes
Title: Google's Neural Machine Translation System: Bridging the Gap Between Human and Machine Translation
Year: 2016