LaserTagger
A LaserTagger is a Sequence Tagging System that converts a source text into a target text by predicting a sequence of text edit operations (tags).
- Context
- Source code available at: https://github.com/google-research/lasertagger
- It was developed by Malmi et al. (2019).
- It can range from a LaserTaggerFF to being a LaserTaggerAR.
- Example(s):
- lasertagger Source Code,
- …
- Counter-Example(s):
- See: GEC Sequence Tagging System, BERT System, Seq2Seq Network, Transformer Network, Grammatical Error Correction System, Document Summarization System, Encoder-Decoder Neural Network.
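The tag-and-realize idea can be illustrated with a small sketch (illustrative Python, not the official google-research/lasertagger code): each source token receives a tag that is either KEEP or DELETE, optionally combined with a phrase to insert before it, and a deterministic "realize" step applies the tags to produce the output.

```python
# Minimal sketch of the "realize" step: apply per-token edit tags
# (KEEP / DELETE, optionally "TAG|phrase" to insert a phrase) to the
# source tokens. Illustrative only; tag encoding is an assumption.

def realize(tokens, tags):
    """Apply edit tags to source tokens to produce the output text."""
    output = []
    for token, tag in zip(tokens, tags):
        base, _, phrase = tag.partition("|")  # "DELETE|and" -> ("DELETE", "and")
        if phrase:
            output.append(phrase)             # insert the added phrase first
        if base == "KEEP":
            output.append(token)              # DELETE simply drops the token
    return " ".join(output)

# Sentence-fusion example: merge two sentences by deleting the first
# period and the repeated subject, and inserting "and".
tokens = ["Turing", "was", "born", "in", "1912", ".",
          "Turing", "died", "in", "1954", "."]
tags = ["KEEP", "KEEP", "KEEP", "KEEP", "KEEP", "DELETE|and",
        "DELETE", "KEEP", "KEEP", "KEEP", "KEEP"]
print(realize(tokens, tags))
# -> Turing was born in 1912 and died in 1954 .
```

Because realization is deterministic, the learning problem reduces to plain sequence tagging over a small edit vocabulary, which is what makes the approach controllable and fast at inference.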
References
2019
- (Malmi et al., 2019) ⇒ Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, and Aliaksei Severyn. (2019). “Encode, Tag, Realize: High-Precision Text Editing.” In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019).
- QUOTE: Our contributions are the following:
- 1) We demonstrate that many text generation tasks with overlapping inputs and outputs can be effectively treated as text editing tasks.
- 2) We propose LaserTagger — a sequence tagging-based model for text editing, together with a method for generating the tag vocabulary from the training data.
- 3) We describe two versions of the tagging model: (i) LaserTaggerFF — a tagger based on BERT (Devlin et al., 2019) and (ii) LaserTaggerAR — a novel tagging model combining the BERT encoder with an autoregressive Transformer decoder, which further improves the results over the BERT tagger.
- 4) We evaluate LaserTagger against strong seq2seq baseline models based on the BERT architecture. Our baseline models outperform previously reported state-of-the-art results on two tasks.
- 5) We demonstrate that a) LaserTaggerAR achieves state-of-the-art or comparable results on 3 out of 4 examined tasks, b) LaserTaggerFF is up to 100× faster at inference time with performance comparable to the state-of-the-art seq2seq models. Furthermore, both models: c) require much less training data compared to the seq2seq models, d) are more controllable and interpretable than seq2seq models due to the small vocabulary of edit operations, e) are less prone to typical seq2seq model errors, such as hallucination.