Grammatical Error Correction (GEC) Algorithm
A Grammatical Error Correction (GEC) Algorithm is a text error correction algorithm that can be implemented by a GEC system to solve a GEC task (to correct grammatical errors).
- Context:
- It can range from being a Word/Token-level GEC Algorithm to being a Character-level GEC Algorithm.
- …
- Example(s):
- a Language Model-based GEC Algorithm, such as (Bryant & Briscoe, 2018)'s.
- …
- Counter-Example(s):
- See: Text Sentence, Transcription Error.
References
2018a
- (Bryant & Briscoe, 2018) ⇒ Christopher Bryant, and Ted Briscoe. (2018). “Language Model Based Grammatical Error Correction Without Annotated Training Data.” In: Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications.
- QUOTE: ... Since the end of the CoNLL-2014 shared task on grammatical error correction (GEC), research into language model (LM) based approaches to GEC has largely stagnated. In this paper, we re-examine LMs in GEC and show that it is entirely possible to build a simple system that not only requires minimal annotated data (~1000 sentences), but is also fairly competitive with several state-of-the-art systems. …
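One common way to realize a language model-based GEC algorithm is to propose alternative words from small confusion sets and keep whichever variant the language model scores highest. The sketch below illustrates that general idea only; it is not the implementation from (Bryant & Briscoe, 2018). The toy corpus, the add-one-smoothed bigram model, the confusion sets, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Toy corpus standing in for large LM training data (assumption).
CORPUS = [
    "she has a cat",
    "he has a dog",
    "they have a cat",
    "they have a dog",
    "she has the book",
]

# Hypothetical confusion sets: words that are plausible substitutes for each other.
CONFUSION_SETS = [{"has", "have"}, {"a", "an", "the"}]


def train_bigram_counts(sentences):
    """Count bigram contexts and bigrams over whitespace-tokenized sentences."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams


def log_prob(tokens, contexts, bigrams, vocab_size):
    """Sentence log-probability under a bigram model with add-one smoothing."""
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
        for prev, cur in zip(padded, padded[1:])
    )


def lm_correct(sentence, contexts, bigrams, vocab_size):
    """Greedy left-to-right pass: keep a substitution only if the LM score improves."""
    tokens = sentence.split()
    for i in range(len(tokens)):
        best, best_score = tokens, log_prob(tokens, contexts, bigrams, vocab_size)
        for confusion_set in CONFUSION_SETS:
            if tokens[i] not in confusion_set:
                continue
            for alternative in confusion_set - {tokens[i]}:
                candidate = tokens[:i] + [alternative] + tokens[i + 1:]
                score = log_prob(candidate, contexts, bigrams, vocab_size)
                if score > best_score:
                    best, best_score = candidate, score
        tokens = best
    return " ".join(tokens)


CONTEXTS, BIGRAMS = train_bigram_counts(CORPUS)
VOCAB_SIZE = len(CONTEXTS) + 1  # context tokens plus the end marker "</s>"

print(lm_correct("they has a cat", CONTEXTS, BIGRAMS, VOCAB_SIZE))
```

Because the language model is trained only on unannotated text, a procedure of this kind needs no parallel corpus of erroneous and corrected sentences; a small annotated set would only be needed for tuning, for example to set a score margin that a substitution must exceed before it is accepted.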
2018b
- (Ge et al., 2018) ⇒ Tao Ge, Furu Wei, and Ming Zhou. (2018). “Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study.” arXiv preprint arXiv:1807.01270.
- ABSTRACT: Neural sequence-to-sequence (seq2seq) approaches have proven to be successful in grammatical error correction (GEC). Based on the seq2seq framework, we propose a novel fluency boost learning and inference mechanism. Fluency boosting learning generates diverse error-corrected sentence pairs during training, enabling the error correction model to learn how to improve a sentence's fluency from more instances, while fluency boosting inference allows the model to correct a sentence incrementally with multiple inference steps. Combining fluency boost learning and inference with convolutional seq2seq models, our approach achieves the state-of-the-art performance: 75.72 (F_{0.5}) on CoNLL-2014 10 annotation dataset and 62.42 (GLEU) on JFLEG test set respectively, becoming the first GEC system that reaches human-level performance (72.58 for CoNLL and 62.37 for JFLEG) on both of the benchmarks.
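The incremental correction idea mentioned in the abstract (correcting a sentence over multiple inference steps) can be sketched as a loop that re-applies a corrector for as long as a fluency score keeps improving. This is a minimal illustration under stated assumptions, not the system of (Ge et al., 2018): `multi_round_correct`, `toy_correct_once`, `toy_fluency`, and the rewrite rules are hypothetical stand-ins for a trained seq2seq model and a real fluency measure.

```python
import re

# Hypothetical rewrite rules standing in for a trained correction model.
TOY_RULES = [
    (r"\ba apple\b", "an apple"),
    (r"\bhe eat\b", "he eats"),
]


def toy_correct_once(sentence):
    """Apply at most one rule per call, mimicking a model that fixes only part of a sentence."""
    for pattern, replacement in TOY_RULES:
        if re.search(pattern, sentence):
            return re.sub(pattern, replacement, sentence, count=1)
    return sentence


def toy_fluency(sentence):
    """Toy fluency score: fewer known error patterns means a higher score."""
    return -sum(1 for pattern, _ in TOY_RULES if re.search(pattern, sentence))


def multi_round_correct(sentence, correct_once, fluency, max_rounds=5):
    """Re-run the corrector on its own output until fluency stops improving."""
    best, best_score = sentence, fluency(sentence)
    for _ in range(max_rounds):
        candidate = correct_once(best)
        score = fluency(candidate)
        if score <= best_score:
            break  # no further improvement, keep the previous output
        best, best_score = candidate, score
    return best


print(multi_round_correct("he eat a apple", toy_correct_once, toy_fluency))
```

In this toy setting a single pass fixes only one of the two errors, while the multi-round loop fixes both, which is the motivation for incremental inference.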