GM-RKB WikiText Error Correction (WTEC) Benchmark Task
A GM-RKB WikiText Error Correction (WTEC) Benchmark Task is a WikiText Error Correction (WTEC) System Benchmark Task that evaluates the performance of GM-RKB WikiText Error Correction (WTEC) Systems.
- Context:
- Task Input: GM-RKB Benchmark Datasets (GM-RKB Datasets, and Wikipedia Datasets).
- Task Output: GM-RKB WTEC System Evaluation Score.
- Task Requirement(s):
- Baseline Models: a Character-level MLE Language Model-based WikiFixer and a Seq2Seq NNet-based WikiFixer;
- a GM-RKB Data Pre-Processing and Noise Generation System that adds human-like editing errors to WikiText;
- a GM-RKB WTEC Model Training System.
- a GM-RKB WikiText Error Correction (WTEC) Benchmark Precision-based Performance Metric:
$Evaluation\,Score = 1 \times TP_{count} - 5 \times FP_{count}$
where $TP_{count}$ is the GM-RKB WTEC System's True Positives count and $FP_{count}$ is its False Positives count, respectively.
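The metric above can be sketched in a few lines of Python. This is an illustrative reimplementation, not the benchmark's own code; the function name `wtec_score` is a hypothetical label:

```python
# Minimal sketch of the precision-based WTEC evaluation score:
# each false positive (an error the system *introduces*, which a human
# must repair) costs five times the benefit of one correct repair.

def wtec_score(tp_count: int, fp_count: int) -> int:
    """Evaluation Score = 1 * TP_count - 5 * FP_count."""
    return 1 * tp_count - 5 * fp_count

# Reproducing two rows of the GM-RKB results table below:
print(wtec_score(16_061, 696))      # WikiFixer NNet GM-RKB
print(wtec_score(18_324, 460_916))  # JamSpell
```

The heavy false-positive penalty explains why the spell-checker baselines, despite high true-positive counts, score far below the WikiFixer models.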
- Example(s):
- GM-RKB WTEC System Benchmark Task output (GM-RKB WTEC Models training on GM-RKB Datasets):
| Model | TP | FP | Score |
|---|---|---|---|
| JamSpell | 18,324 | 460,916 | -2,286,256 |
| Pyenchant | 18,630 | 4,717,170 | -23,567,220 |
| WikiFixer MLE | 9,838 | 449 | 7,593 |
| WikiFixer NNet GM-RKB | 16,061 | 696 | 12,581 |
| WikiFixer NNet Wikipedia | 8,678 | 524 | 6,058 |
| WikiFixer NNet Wikipedia pretrained + GM-RKB | 13,841 | 490 | 11,391 |
| WikiFixer NNet Wikipedia 7,000 pages + GM-RKB | 16,003 | 652 | 12,743 |
- GM-RKB WTEC System Benchmark Task output (GM-RKB WTEC Models training on Wikipedia Datasets):
| Model | TP | FP | Score |
|---|---|---|---|
| JamSpell | 11,479 | 312,809 | -1,552,566 |
| Pyenchant | 9,656 | 8,351,825 | -41,749,469 |
| WikiFixer MLE | 252 | 166 | -578 |
| WikiFixer NNet GM-RKB | 3,954 | 287 | 2,519 |
| WikiFixer NNet Wikipedia | 6,385 | 211 | 5,330 |
| WikiFixer NNet Wikipedia pretrained + GM-RKB | 3,284 | 160 | 2,484 |
| WikiFixer NNet Wikipedia 7,000 pages + GM-RKB | 6,056 | 277 | 4,671 |
- Counter-Example(s):
- See: WikiText Error Correction (WTEC) System, GM-RKB Wikification System, Natural Language Processing System, Misspelling Correction System, Parsing System, Wiki Markup Language, Text Error Correction System, Seq2Seq Encoder-Decoder Neural Network, GM-RKB Seq2Seq Encoder-Decoder Neural Network, GM-RKB Character-Level MLE Language Model, Language Model, Character-Level MLE Language Model.
References
2020
- (Melli et al., 2020) ⇒ Gabor Melli, Abdelrhman Eldallal, Bassim Lazem, and Olga Moreira (2020). “GM-RKB WikiText Error Correction Task and Baselines”. In: Proceedings of the 12th Language Resources and Evaluation Conference (LREC-2020).
- QUOTE: We designed and implemented the GM-RKB WikiText Error Correction (WEC) Task to benchmark systems that attempt to automatically recognize and repair simple typographical errors in WikiText based on frequent patterns observed in the corpus. The task consisted in conducting a series of experiments on benchmark datasets to find the best performing WEC system. We adopted a precision-based performance metric because we were interested in measuring the balance between the welcome benefit of a WEC system repairing an error correctly and the significant cost of it introducing an error that must be repaired manually. We compared the relative performance of a character MLE Language Model-based and a sequence-to-sequence (seq2seq) neural network-based WEC, as well as two spelling error correction systems trained on GM-RKB and Wikipedia corpora datasets. Because of the difficulty in logging real wikitext errors introduced by human editors, we developed a sub-system that can artificially add human-like editing errors to the original text and convert it to training data. (...)
The GM-RKB WTEC Benchmark Task can be divided in 3 main sub-tasks: (1) the creation and preparation of the training dataset; (2) the training of WEC models on the datasets; and (3) the analysis of the relative performance of WEC systems. Fig.1 shows a schematic view of our benchmarking process.
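Sub-task (1), the noise-injection step that turns clean WikiText into (noisy, clean) training pairs, can be sketched as follows. This is a hypothetical illustration of the idea, not the authors' sub-system; the edit operations and probabilities shown are assumptions chosen to mimic common human slips:

```python
import random

# Illustrative noise generator: with probability p per character, apply one
# human-like edit (drop a character, duplicate it, flip its case, or insert
# a stray bracket -- a typical WikiText markup slip).

def add_noise(text: str, p: float = 0.01, seed: int = 0) -> str:
    rng = random.Random(seed)  # seeded for reproducible training data
    out = []
    for ch in text:
        if rng.random() < p:
            op = rng.choice(["delete", "duplicate", "swap_case", "insert"])
            if op == "delete":
                continue                      # drop the character
            elif op == "duplicate":
                out.append(ch + ch)           # doubled character
            elif op == "swap_case":
                out.append(ch.swapcase())     # wrong capitalization
            else:
                out.append(ch + rng.choice("[]{}"))  # stray bracket
        else:
            out.append(ch)
    return "".join(out)

clean = "A [[WikiText Error Correction]] system repairs markup."
noisy = add_noise(clean, p=0.05, seed=42)
pair = (noisy, clean)  # one (input, target) training example
```

A model trained on such pairs learns to map the corrupted text back to the original, which is the supervision signal sub-task (2) relies on.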