LSTM-based Encoder-Decoder Network
An LSTM-based Encoder-Decoder Network is an RNN-based encoder-decoder model in which both the encoder and the decoder are LSTM models (an LSTM encoder and an LSTM decoder).
- Context:
- It can be trained by an LSTM-based Encoder/Decoder RNN Training System.
- Example(s):
- Counter-Example(s):
- See: Neural seq2seq, Bidirectional LSTM.
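The encoder/decoder split defined above can be illustrated with a minimal forward-pass sketch in plain Python. It is not taken from any of the referenced works. It assumes one-hot token inputs, a single LSTM layer on each side, separate encoder and decoder parameters, the encoder's final hidden and cell state used as the decoder's initial state, and greedy decoding. All names (`LSTMCell`, `encode`, `decode`, `W_out`, `SOS`, `EOS`) are hypothetical, and the weights are random and untrained, so the decoded token ids carry no meaning.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def one_hot(index, size):
    return [1.0 if j == index else 0.0 for j in range(size)]

class LSTMCell:
    """A single LSTM cell with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        cols = input_size + hidden_size
        # One weight matrix and bias per gate:
        # input (i), forget (f), output (o), candidate (g).
        self.W = {gate: [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                         for _ in range(hidden_size)] for gate in "ifog"}
        self.b = {gate: [0.0] * hidden_size for gate in "ifog"}

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state

        def affine(gate):
            return [sum(w * v for w, v in zip(row, z)) + bias
                    for row, bias in zip(self.W[gate], self.b[gate])]

        i = [sigmoid(a) for a in affine("i")]
        f = [sigmoid(a) for a in affine("f")]
        o = [sigmoid(a) for a in affine("o")]
        g = [math.tanh(a) for a in affine("g")]
        # New cell state: keep part of the old state, add gated candidate.
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

def encode(cell, token_ids, vocab_size):
    """Run the encoder LSTM over the input and return its final (h, c)."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for t in token_ids:
        h, c = cell.step(one_hot(t, vocab_size), h, c)
    return h, c

def decode(cell, W_out, state, vocab_size, sos_id, eos_id, max_len):
    """Greedy decoding: start from the encoder state, feed back each prediction."""
    h, c = state
    prev, output = sos_id, []
    for _ in range(max_len):
        h, c = cell.step(one_hot(prev, vocab_size), h, c)
        # Project the hidden state to one score per vocabulary entry.
        logits = [sum(w * v for w, v in zip(row, h)) for row in W_out]
        prev = max(range(vocab_size), key=logits.__getitem__)
        if prev == eos_id:
            break
        output.append(prev)
    return output

rng = random.Random(0)
VOCAB, HIDDEN, SOS, EOS = 6, 4, 0, 1
encoder = LSTMCell(VOCAB, HIDDEN, rng)  # encoder parameters
decoder = LSTMCell(VOCAB, HIDDEN, rng)  # separate decoder parameters
W_out = [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)] for _ in range(VOCAB)]

state = encode(encoder, [2, 3, 4], VOCAB)
tokens = decode(decoder, W_out, state, VOCAB, SOS, EOS, max_len=5)
print(tokens)
```

In practice the two cells and the output projection would be trained jointly, typically with a deep learning library rather than hand-written loops. The sketch only shows how information flows from the LSTM encoder to the LSTM decoder.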
References
2018
- (Brownlee, 2018) ⇒ Jason Brownlee. (2018). “Encoder-Decoder Recurrent Neural Network Models for Neural Machine Translation.” Blog Post
- QUOTE: After reading this post, you will know:
- The encoder-decoder recurrent neural network architecture is the core technology inside Google’s translate service.
- The so-called “Sutskever model” for direct end-to-end machine translation.
- The so-called “Cho model” that extends the architecture with GRU units and an attention mechanism.
2017
- (Robertson, 2017) ⇒ Sean Robertson. (2017). “Translation with a Sequence to Sequence Network and Attention.” In: TensorFlow Tutorials
- QUOTE: A basic sequence-to-sequence model, as introduced in Cho et al., 2014, consists of two recurrent neural networks (RNNs): an encoder that processes the input and a decoder that generates the output. This basic architecture is depicted below. ...
... Each box in the picture above represents a cell of the RNN, most commonly a GRU cell or an LSTM cell (see the RNN Tutorial for an explanation of those). Encoder and decoder can share weights or, as is more common, use a different set of parameters. Multi-layer cells have been successfully used in sequence-to-sequence models too, e.g. for translation Sutskever et al., 2014.
2014a
- (Sutskever et al., 2014) ⇒ Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. (2014). “Sequence to Sequence Learning with Neural Networks.” In: Advances in Neural Information Processing Systems.