LaserTagger
A LaserTagger is a Sequence Tagging System that converts a source text into a target text by predicting a sequence of text edit operations (tags).
- Context
- Source code available at: https://github.com/google-research/lasertagger
- It was developed by Malmi et al. (2019).
- It can range from a LaserTaggerFF to being a LaserTaggerAR.
- Example(s):
- lasertagger Source Code,
- …
- Counter-Example(s):
- See: GEC Sequence Tagging System, BERT System, Seq2Seq Network, Transformer Network, Grammatical Error Correction System, Document Summarization System, Encoder-Decoder Neural Network.
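The tag-and-realize idea can be illustrated with a small sketch (illustrative Python, not the official google-research/lasertagger code): each source token receives a tag that is either KEEP or DELETE, optionally combined with a phrase to insert before it, and a deterministic "realize" step applies the tags to produce the output.

```python
# Minimal sketch of the "realize" step: apply per-token edit tags
# (KEEP / DELETE, optionally "TAG|phrase" to insert a phrase) to the
# source tokens. Illustrative only; tag encoding is an assumption.

def realize(tokens, tags):
    """Apply edit tags to source tokens to produce the output text."""
    output = []
    for token, tag in zip(tokens, tags):
        base, _, phrase = tag.partition("|")  # "DELETE|and" -> ("DELETE", "and")
        if phrase:
            output.append(phrase)             # insert the added phrase first
        if base == "KEEP":
            output.append(token)              # DELETE simply drops the token
    return " ".join(output)

# Sentence-fusion example: merge two sentences by deleting the first
# period and the repeated subject, and inserting "and".
tokens = ["Turing", "was", "born", "in", "1912", ".",
          "Turing", "died", "in", "1954", "."]
tags = ["KEEP", "KEEP", "KEEP", "KEEP", "KEEP", "DELETE|and",
        "DELETE", "KEEP", "KEEP", "KEEP", "KEEP"]
print(realize(tokens, tags))
# -> Turing was born in 1912 and died in 1954 .
```

Because realization is deterministic, the learning problem reduces to plain sequence tagging over a small edit vocabulary, which is what makes the approach controllable and fast at inference.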
References
2019
- (Malmi et al., 2019) ⇒ Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, and Aliaksei Severyn. (2019). “Encode, Tag, Realize: High-Precision Text Editing.” In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019).
- QUOTE: Our contributions are the following:
- 1) We demonstrate that many text generation tasks with overlapping inputs and outputs can be effectively treated as text editing tasks.
- 2) We propose LaserTagger — a sequence tagging-based model for text editing, together with a method for generating the tag vocabulary from the training data.
- 3) We describe two versions of the tagging model: (i) LaserTaggerFF — a tagger based on BERT (Devlin et al., 2019) and (ii) LaserTaggerAR — a novel tagging model combining the BERT encoder with an autoregressive Transformer decoder, which further improves the results over the BERT tagger.
- 4) We evaluate LaserTagger against strong seq2seq baseline models based on the BERT architecture. Our baseline models outperform previously reported state-of-the-art results on two tasks.
- 5) We demonstrate that a) LaserTaggerAR achieves state-of-the-art or comparable results on 3 out of 4 examined tasks, b) LaserTaggerFF is up to 100× faster at inference time with performance comparable to the state-of-the-art seq2seq models. Furthermore, both models: c) require much less training data compared to the seq2seq models, d) are more controllable and interpretable than seq2seq models due to the small vocabulary of edit operations, e) are less prone to typical seq2seq model errors, such as hallucination.