MaskGAN Network Model
A MaskGAN Network Model is a Generative Adversarial Network that fills in missing text conditioned on the surrounding context of a masked input sequence.
- Context:
- It is based on a Sequence-to-Sequence (seq2seq) Neural Network architecture.
- It is composed of a Generator Neural Network, a Discriminator Neural Network, and a Critic Neural Network. Both the generator and the discriminator consist of an encoder-decoder module (a minimal sketch of this layout appears after the See list below).
- Generator Network - the generator-encoder reads the masked sequence and the generator-decoder fills in the missing tokens autoregressively, conditioned on the encoder hidden states.
- Discriminator Network - it has an architecture identical to the generator's. It receives as inputs both the filled-in sequence from the generator and the original masked context, and it computes the probability of each token being real given the true context of the masked sequence.
- Critic Network - it is implemented as an additional head off the discriminator and estimates the discounted total return of the filled-in sequence (see the return-computation sketch after the See list below).
- It can be trained using MaskGAN Training System and evaluated by a MaskGAN Benchmark Task.
- Its software repository is available at https://github.com/tensorflow/models/tree/master/research/maskgan
- Example(s):
- the NN model described in Fedus et al. (2018),
- …
- Counter-Example(s):
- GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution (GSGAN),
- Long Text Generation via Adversarial Training with Leaked Information (LeakGAN),
- Maximum-Likelihood Augmented Discrete Generative Adversarial Networks (MaliGAN),
- Adversarial Ranking for Language Generation (RankGAN),
- Sequence Generative Adversarial Nets with Policy Gradient (SeqGAN),
- Adversarial Feature Matching for Text Generation (TextGAN).
- See: Neural Text Generation System, Seq2Seq Model, Neural Autoregressive Model, Professor Forcing Algorithm, Scheduled Sampling Algorithm.
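The component layout described in the Context bullets can be made concrete with a short sketch. Below is a minimal, illustrative PyTorch rendering (the official implementation is in TensorFlow; all class names and sizes here are hypothetical): a seq2seq encoder-decoder core, a generator head producing a distribution over the vocabulary, and a discriminator that adds a per-token scalar head plus a critic head.

```python
import torch
import torch.nn as nn


class Seq2SeqCore(nn.Module):
    """Encoder-decoder layout used by both the generator and the
    discriminator, which the paper gives identical architectures."""

    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.encoder = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.decoder = nn.LSTM(hidden_size, hidden_size, batch_first=True)

    def forward(self, masked_ids, decoder_ids):
        # The encoder reads the masked sequence m(x); its final state seeds
        # the decoder, giving the decoder access to future context.
        _, state = self.encoder(self.embed(masked_ids))
        hidden, _ = self.decoder(self.embed(decoder_ids), state)
        return hidden


class Generator(nn.Module):
    """Outputs a distribution over the vocabulary at each time step."""

    def __init__(self, vocab_size, hidden_size=64):
        super().__init__()
        self.core = Seq2SeqCore(vocab_size, hidden_size)
        self.to_vocab = nn.Linear(hidden_size, vocab_size)

    def forward(self, masked_ids, prev_ids):
        # Teacher-forced pass for brevity; at sampling time the decoder
        # would instead condition on its own previously filled-in tokens.
        return self.to_vocab(self.core(masked_ids, prev_ids))


class DiscriminatorWithCritic(nn.Module):
    """Identical core, but outputs a scalar per-token probability of being
    real, plus a critic head estimating the discounted total return."""

    def __init__(self, vocab_size, hidden_size=64):
        super().__init__()
        self.core = Seq2SeqCore(vocab_size, hidden_size)
        self.to_real = nn.Linear(hidden_size, 1)
        self.to_value = nn.Linear(hidden_size, 1)

    def forward(self, masked_ids, filled_ids):
        h = self.core(masked_ids, filled_ids)
        p_real = torch.sigmoid(self.to_real(h)).squeeze(-1)
        value = self.to_value(h).squeeze(-1)
        return p_real, value
```

Note that in the paper the generator and discriminator are two separate networks that merely share the same architecture; the common Seq2SeqCore class above expresses that architectural identity, not weight sharing (each network instantiates its own core).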
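For the critic's training target, the paper takes the per-token reward to be the logarithm of the discriminator's probability and the return to be its discounted sum. The following is a small NumPy sketch of that return computation, assuming the common $\gamma^{s-t}$ discounting convention and, for simplicity, ignoring the paper's restriction of rewards to masked positions; the function name and clipping constant are our own.

```python
import numpy as np

def discounted_returns(p_real, gamma=0.99):
    """Compute R_t = sum_{s=t..T} gamma^(s-t) * r_s with reward
    r_s = log D(x_hat_s); the critic head learns to estimate these."""
    rewards = np.log(np.clip(p_real, 1e-8, 1.0))  # clip to avoid log(0)
    returns = np.zeros_like(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# Example: convincing tokens (high discriminator probability) contribute
# rewards near zero, unconvincing ones strongly negative rewards.
print(discounted_returns(np.array([0.9, 0.2, 0.95])))
```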
References
2018
- (Fedus et al., 2018) ⇒ William Fedus, Ian Goodfellow, and Andrew M. Dai. (2018). “MaskGAN: Better Text Generation via Filling in the ________”. In: Proceedings of the Sixth International Conference on Learning Representations (ICLR-2018).
- QUOTE: We introduce an actor-critic conditional GAN that fills in missing text conditioned on the surrounding context. (...)
The task of imputing missing tokens requires that our MaskGAN architecture condition on information from both the past and the future. We choose to use a seq2seq (Sutskever et al., 2014) architecture. Our generator consists of an encoding module and decoding module.
(...)The encoder reads in the masked sequence, which we denote as $m(\mathbf{x})$, where the mask is applied element-wise. The encoder provides access to future context for the MaskGAN during decoding.
As in standard language-modeling, the decoder fills in the missing tokens auto-regressively, however, it is now conditioned on both the masked text $m(\mathbf{x})$ as well as what it has filled-in up to that point. The generator decomposes the distribution over the sequence into an ordered conditional sequence ...
(...)The discriminator has an identical architecture to the generator except that the output is a scalar probability at each time point, rather than a distribution over the vocabulary size. The discriminator is given the filled-in sequence from the generator, but importantly, it is given the original real context $m(\mathbf{x})$. We give the discriminator the true context, otherwise, this algorithm has a critical failure mode.
(...)
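The "ordered conditional sequence" elided at the end of the quoted passage is the standard autoregressive factorization; restated here in the same notation (a reconstruction of the paper's formulation, not a verbatim quote):

$$P\left(\hat{x}_1, \ldots, \hat{x}_T \mid m(\mathbf{x})\right) = \prod_{t=1}^{T} P\left(\hat{x}_t \mid \hat{x}_1, \ldots, \hat{x}_{t-1}, m(\mathbf{x})\right)$$

That is, each filled-in token $\hat{x}_t$ is conditioned both on the tokens generated so far and on the masked context $m(\mathbf{x})$.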