Neural Sequence-to-Sequence (seq2seq)-based Model Training System
A Neural Sequence-to-Sequence (seq2seq)-based Model Training System is an encoder-decoder network training system (a sequence-to-sequence model training system) that implements a neural seq2seq training algorithm to solve a seq2seq training task (producing a trained neural seq2seq-based model).
- Context:
- …
- Example(s):
- Counter-Example(s):
- See: LSTM System, Encoder-Decoder with Attention Model Training System.
References
2017a
- (Keras Blog, 2017) ⇒ https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html
- QUOTE: ... Let's illustrate these ideas with actual code.
For our example implementation, we will use a dataset of pairs of English sentences and their French translations, which you can download from manythings.org/anki. The file to download is called fra-eng.zip. We will implement a character-level sequence-to-sequence model, processing the input character-by-character and generating the output character-by-character. Another option would be a word-level model, which tends to be more common for machine translation. At the end of this post, you will find some notes about turning our model into a word-level model using Embedding layers.
The full script for our example can be found on GitHub.
Here's a summary of our process:
- 1) Turn the sentences into 3 Numpy arrays, `encoder_input_data`, `decoder_input_data`, `decoder_target_data`:
 - `encoder_input_data` is a 3D array of shape (`num_pairs`, `max_english_sentence_length`, `num_english_characters`) containing a one-hot vectorization of the English sentences.
 - `decoder_input_data` is a 3D array of shape (`num_pairs`, `max_french_sentence_length`, `num_french_characters`) containing a one-hot vectorization of the French sentences.
 - `decoder_target_data` is the same as `decoder_input_data` but offset by one timestep. `decoder_target_data[:, t, :]` will be the same as `decoder_input_data[:, t + 1, :]`.
- 2) Train a basic LSTM-based Seq2Seq model to predict `decoder_target_data` given `encoder_input_data` and `decoder_input_data`. Our model uses teacher forcing.
- 3) Decode some sentences to check that the model is working (i.e. turn samples from `encoder_input_data` into corresponding samples from `decoder_target_data`).
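The steps above map onto a small amount of Keras code. The following condensed sketch closely follows the model from the post's accompanying script; the dataset sizes, `latent_dim`, and the `fit` hyperparameters are illustrative placeholders, and the random one-hot arrays stand in for the vectorized fra-eng pairs built in step 1.

```python
import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Dense

latent_dim = 256                                   # illustrative LSTM state size
num_pairs, max_en_len, max_fr_len = 1000, 16, 20   # illustrative dataset sizes
num_english_characters, num_french_characters = 70, 90

# Random one-hot stand-ins for the three arrays built in step 1.
encoder_input_data = np.eye(num_english_characters)[
    np.random.randint(num_english_characters, size=(num_pairs, max_en_len))]
decoder_input_data = np.eye(num_french_characters)[
    np.random.randint(num_french_characters, size=(num_pairs, max_fr_len))]
decoder_target_data = np.roll(decoder_input_data, -1, axis=1)  # offset by one timestep

# Encoder: read the English characters and keep only the final LSTM states.
encoder_inputs = Input(shape=(None, num_english_characters))
_, state_h, state_c = LSTM(latent_dim, return_state=True)(encoder_inputs)
encoder_states = [state_h, state_c]

# Decoder: trained with teacher forcing to predict the next French character,
# starting from the encoder's final states.
decoder_inputs = Input(shape=(None, num_french_characters))
decoder_outputs, _, _ = LSTM(latent_dim, return_sequences=True,
                             return_state=True)(decoder_inputs,
                                                initial_state=encoder_states)
decoder_outputs = Dense(num_french_characters, activation='softmax')(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          batch_size=64, epochs=1, validation_split=0.2)
```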
2017b
- (GitHub, 2017) ⇒ https://github.com/IBM/pytorch-seq2seq
- QUOTE: This is a framework for sequence-to-sequence (seq2seq) models implemented in PyTorch. The framework has modularized and extensible components for seq2seq models, training and inference, checkpoints, etc. This is an alpha release. We appreciate any kind of feedback or contribution.
Seq2seq is a fast-evolving field, with new techniques and architectures being published frequently. The goal of this library is to facilitate the development of such techniques and applications. While constantly improving the quality of code and documentation, we will focus on the following items:
- Evaluation with benchmarks such as WMT machine translation, COCO image captioning, conversational models, etc.;
- Provide more flexible model options, improving the usability of the library;
- Adding the latest architectures, such as the CNN-based model proposed in Convolutional Sequence to Sequence Learning and the Transformer model proposed in Attention Is All You Need;
- Support features in the new versions of PyTorch.
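As a concrete illustration of the kind of model such a framework trains, here is a minimal plain-PyTorch encoder-decoder sketch with teacher forcing. Note that this is generic PyTorch, not the pytorch-seq2seq library's own API; the class, vocabulary sizes, and tensor shapes are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Generic GRU encoder-decoder; NOT the pytorch-seq2seq library API."""
    def __init__(self, src_vocab, tgt_vocab, hidden=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, hidden)
        self.tgt_emb = nn.Embedding(tgt_vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt_in):
        _, h = self.encoder(self.src_emb(src))              # final state summarizes src
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), h)  # teacher forcing
        return self.out(dec_out)                            # (batch, len, tgt_vocab) logits

model = Seq2Seq(src_vocab=5000, tgt_vocab=6000)
src = torch.randint(0, 5000, (8, 20))    # dummy batch of source token ids
tgt = torch.randint(0, 6000, (8, 22))    # dummy batch of target token ids
logits = model(src, tgt[:, :-1])         # predict each next target token
loss = nn.functional.cross_entropy(logits.reshape(-1, 6000), tgt[:, 1:].reshape(-1))
loss.backward()
```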
2017c
- (Jacobs, 2017) ⇒ Kevin Jacobs. (2017). “Create a Character-based Seq2Seq Model Using Python and Tensorflow.” Blog post, 2017-12-14.
- QUOTE: … I will share my findings on creating a character-based Sequence-to-Sequence model (Seq2Seq) and I will share some of the results I have found. …
… The Seq2Seq (sequence-to-sequence) model has the following architecture: [architecture figure: an encoder that compresses the input sequence into a state vector, followed by a decoder that generates the output sequence from that state]
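To make the character-by-character generation concrete, here is a minimal greedy decoding loop for such a character-level model. It is a generic sketch, not code from the blog post: `encode_fn` and `decode_step_fn` are hypothetical stand-ins for a trained encoder and a one-step decoder, and `char_index` is an assumed character-to-index mapping.

```python
import numpy as np

def greedy_decode(encode_fn, decode_step_fn, input_seq, char_index,
                  start_char='\t', stop_char='\n', max_len=100):
    """Generate output character-by-character from a trained encoder-decoder."""
    state = encode_fn(input_seq)                  # summarize the input sequence
    index_char = {i: c for c, i in char_index.items()}
    prev_char, decoded = start_char, ''
    for _ in range(max_len):
        # One-hot encode the character generated at the previous step.
        x = np.zeros((1, 1, len(char_index)))
        x[0, 0, char_index[prev_char]] = 1.0
        probs, state = decode_step_fn(x, state)   # run one decoder timestep
        prev_char = index_char[int(np.argmax(probs))]
        if prev_char == stop_char:                # stop at end-of-sequence character
            break
        decoded += prev_char
    return decoded
```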
2017d
- (GitHub, 2017) ⇒ https://google.github.io/seq2seq/
- QUOTE: ... tf-seq2seq is a general-purpose encoder-decoder framework for Tensorflow that can be used for Machine Translation, Text Summarization, Conversational Modeling, Image Captioning, and more.