Pointer-Generator Sequence-to-Sequence Neural Network


A Pointer-Generator Sequence-to-Sequence Neural Network is a Sequence-to-Sequence Neural Network With Attention that combines a Pointer Network Model (which can copy words from the source text) with a Word Generation Probability Function (which decides, at each decoder timestep, whether to generate a word from a fixed vocabulary or to copy one from the source).
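In See et al. (2017)'s formulation (see the 2017 reference below), at each decoder timestep $t$ a generation probability $p_{gen} \in [0,1]$ mixes the vocabulary distribution $P_{vocab}$ with the attention distribution $a^t$ over source positions, yielding the final distribution over an extended vocabulary:

$P(w) = p_{gen} P_{vocab}(w) + (1 - p_{gen}) \sum_{i : w_i = w} a_i^t$

so a word $w$ that is out-of-vocabulary can still receive probability mass, provided it appears in the source text.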



References

2017

(See et al., 2017) ⇒ Abigail See, Peter J. Liu, and Christopher D. Manning. (2017). "Get To The Point: Summarization with Pointer-Generator Networks." In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017).

Figure 3: Pointer-generator model. For each decoder timestep a generation probability $p_{gen} \in [0,1]$ is calculated, which weights the probability of generating words from the vocabulary, versus copying words from the source text. The vocabulary distribution and the attention distribution are weighted and summed to obtain the final distribution, from which we make our prediction. Note that out-of-vocabulary article words such as 2-0 are included in the final distribution. Best viewed in color.
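The mixing step described in this caption can be sketched in a few lines of NumPy (a minimal, illustrative implementation; the function name, argument layout, and the extended-vocabulary id convention are assumptions, not code from the paper):

import numpy as np

def final_distribution(p_vocab, attention, src_ids, p_gen, n_extended):
    """Blend the generation and copy distributions for one decoder timestep.

    p_vocab    : (vocab_size,) distribution over the fixed vocabulary
    attention  : (src_len,) attention weights over source positions (sums to 1)
    src_ids    : (src_len,) extended-vocabulary ids of the source tokens,
                 where OOV source words are assigned ids >= vocab_size
    p_gen      : scalar in [0, 1], the generation probability
    n_extended : vocab_size plus the number of distinct source OOV words
    """
    final = np.zeros(n_extended)
    final[: len(p_vocab)] = p_gen * p_vocab
    # Scatter-add the copy mass: each source position contributes its
    # (1 - p_gen)-weighted attention weight at its token's id, so repeated
    # source words accumulate, and OOV words like "2-0" get nonzero mass.
    np.add.at(final, src_ids, (1.0 - p_gen) * attention)
    return final  # a valid distribution over the extended vocabulary

Because np.add.at accumulates over repeated indices, a word that occurs several times in the source collects the attention mass of all its occurrences, matching the sum over positions in the equation above.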

2015

(Vinyals et al., 2015) ⇒ Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. (2015). "Pointer Networks." In: Advances in Neural Information Processing Systems 28 (NIPS 2015).

Figure 1: (a) Sequence-to-Sequence - An RNN (blue) processes the input sequence to create a code vector that is used to generate the output sequence (purple) using the probability chain rule and another RNN. The output dimensionality is fixed by the dimensionality of the problem, and it is the same during training and inference, as in Sutskever et al. (2014). (b) Ptr-Net - An encoding RNN converts the input sequence to a code (blue) that is fed to the generating network (purple). At each step, the generating network produces a vector that modulates a content-based attention mechanism over the inputs (Bahdanau et al., 2015; Graves et al., 2014). The output of the attention mechanism is a softmax distribution with dictionary size equal to the length of the input.
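The content-based attention step in panel (b) can likewise be sketched as follows (illustrative only: the score $u_i = v^\top \tanh(W_1 e_i + W_2 d)$ follows Vinyals et al.'s notation, but the shapes and the NumPy framing are assumptions):

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def pointer_attention(enc_states, dec_state, W1, W2, v):
    """One Ptr-Net decoder step: a distribution over input positions.

    enc_states : (src_len, d) encoder hidden states e_i
    dec_state  : (d,) current decoder (generating network) state
    W1, W2     : (d, k) learned projections; v : (k,) learned vector
    """
    # u_i = v^T tanh(W1 e_i + W2 d), one score per input position
    scores = np.tanh(enc_states @ W1 + dec_state @ W2) @ v
    # The "dictionary" of this softmax is the input itself, so its size
    # grows and shrinks with the input length.
    return softmax(scores)

In the Ptr-Net this distribution is the prediction itself (a choice of input position); in the pointer-generator model above, the same attention distribution is instead blended with a vocabulary distribution via $p_{gen}$.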