JamSpell
A JamSpell is a Context-Sensitive Spelling Error Correction System developed by Filipp Ozinov.
- AKA: JamSpell Spelling Error Correction System, JamSpell Spellchecker, JamSpell Spell Checking System.
- Context:
- It is available at https://github.com/bakwc/JamSpell
- It can solve a JamSpell Spelling Error Correction Task by implementing a JamSpell Spelling Error Correction Algorithm.
- It uses a Synthetic Spelling Error Generation System.
- It has been evaluated by a JamSpell Benchmark Task.
- Example(s):
- jamspell 0.0.12: a Python implementation by Filipp Ozinov (2020); see the usage sketch below.
- …
- Counter-Example(s):
- See: Spelling Error Correction (SEC) System, Grammatical Error Correction (GEC) System, WikiText Error Correction (WTEC) System, Natural Language Processing (NLP) System.
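The jamspell Python package exposes a small API for context-sensitive correction. The sketch below follows the usage shown in the project README (TSpellCorrector, LoadLangModel, FixFragment, GetCandidates); the model file en.bin is a placeholder that must be downloaded or trained separately, so treat this as a minimal sketch rather than a verified script.

```python
import jamspell

# Create a corrector and load a pretrained language model.
# 'en.bin' is a placeholder; language models are downloaded or
# trained separately (see the project README).
corrector = jamspell.TSpellCorrector()
corrector.LoadLangModel('en.bin')

# Fix a whole fragment: surrounding words (context) inform each correction.
print(corrector.FixFragment('I am the begt spell cherken!'))
# expected: 'I am the best spell checker!'

# Rank candidate corrections for the word at index 3 ('begt').
print(corrector.GetCandidates(['i', 'am', 'the', 'begt'], 3))
```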
References
2020a
- (Ozinov, 2020) ⇒ GitHub Project: https://github.com/bakwc/JamSpell Retrieved: 2020-02-06
- QUOTE: JamSpell is a spell checking library with the following features:
- accurate - it considers a word's surroundings (context) for better correction;
- fast - near 5K words per second;
- multi-language - it's written in C++ and available for many languages with SWIG bindings.
- ...
| Model | Errors | Top 7 Errors | Fix Rate | Top 7 Fix Rate | Broken | Speed (words/second) |
| --- | --- | --- | --- | --- | --- | --- |
| JamSpell | 3.25% | 1.27% | 79.53% | 84.10% | 0.64% | 4854 |
| Norvig | 7.62% | 5.00% | 46.58% | 66.51% | 0.69% | 395 |
| Hunspell | 13.10% | 10.33% | 47.52% | 68.56% | 7.14% | 163 |
| Dummy | 13.14% | 13.14% | 0.00% | 0.00% | 0.00% | - |
- The model was trained on 300K Wikipedia sentences + 300K news sentences (English); 95% was used for training and 5% for evaluation. An error model was used to generate errored text from the original, and the JamSpell corrector was compared with Norvig's corrector, Hunspell, and a dummy corrector (no corrections).
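The benchmark above depends on an error model that corrupts clean sentences (cf. the Synthetic Spelling Error Generation System noted in the Context section). The following is a minimal, hypothetical sketch of that idea using random character edits; it is not JamSpell's actual error model.

```python
import random
import string

def add_noise(word: str, p: float = 0.1) -> str:
    """Corrupt a word with one random character edit with probability p.

    A hypothetical stand-in for the error model used to generate
    errored evaluation text; not JamSpell's actual implementation.
    """
    if random.random() > p or not word:
        return word
    i = random.randrange(len(word))
    op = random.choice(['insert', 'delete', 'swap', 'replace'])
    c = random.choice(string.ascii_lowercase)
    if op == 'insert':
        return word[:i] + c + word[i:]
    if op == 'delete':
        return word[:i] + word[i + 1:]
    if op == 'swap' and i < len(word) - 1:
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    return word[:i] + c + word[i + 1:]  # replace (or swap at last index)

sentence = 'the quick brown fox jumps over the lazy dog'
errored = ' '.join(add_noise(w, p=0.3) for w in sentence.split())
print(errored)  # e.g. 'the quikc brown fxo jumps over teh lazy dog'
```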
2020b
- (Melli et al., 2020) ⇒ Gabor Melli, Abdelrhman Eldallal, Bassim Lazem, and Olga Moreira. (2020). “GM-RKB WikiText Error Correction Task and Baselines.” In: Proceedings of LREC 2020 (LREC-2020).
- QUOTE: Although the task of correcting natural language human-written text is different from that of correcting Wiki pages, we tested and compared spelling correction tools for performance evaluation purposes. We tested JamSpell, a Python library that checks and corrects spelling in text, and Pyenchant, a similar spelling tool. The JamSpell library takes the full sentence as input and considers the context. ...
Tables 1 and 2 summarize the task results. They show the number of TPs, FPs, and the Eq. 1 performance scores for all the WikiText repairing tools described in Sec. 3.1, trained and tested on the GM-RKB and Wikipedia datasets described in Sec. 4.
Table 1: results on the GM-RKB test dataset.
| Model | TP | FP | Score |
| --- | --- | --- | --- |
| JamSpell | 18,324 | 460,916 | -2,286,256 |
| Pyenchant | 18,630 | 4,717,170 | -23,567,220 |
| WikiFixer MLE | 9,838 | 449 | 7,593 |
| WikiFixer NNet GM-RKB | 16,061 | 696 | 12,581 |
| WikiFixer NNet Wikipedia | 8,678 | 524 | 6,058 |
| WikiFixer NNet Wikipedia pretrained + GM-RKB | 13,841 | 490 | 11,391 |
| WikiFixer NNet Wikipedia 7,000 pages + GM-RKB | 16,003 | 652 | 12,743 |
Table 2: results on the Wikipedia test dataset.
| Model | TP | FP | Score |
| --- | --- | --- | --- |
| JamSpell | 11,479 | 312,809 | -1,552,566 |
| Pyenchant | 9,656 | 8,351,825 | -41,749,469 |
| WikiFixer MLE | 252 | 166 | -578 |
| WikiFixer NNet GM-RKB | 3,954 | 287 | 2,519 |
| WikiFixer NNet Wikipedia | 6,385 | 211 | 5,330 |
| WikiFixer NNet Wikipedia pretrained + GM-RKB | 3,284 | 160 | 2,484 |
| WikiFixer NNet Wikipedia 7,000 pages + GM-RKB | 6,056 | 277 | 4,671 |
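The paper's Eq. 1 score is not restated in the quote, but every row of both tables matches the penalty-weighted score TP - 5 × FP; this is an inference from the table values, checked below on a few sample rows.

```python
# Check that Score = TP - 5*FP reproduces sample rows from Tables 1 and 2.
rows = [
    ('JamSpell, GM-RKB', 18_324, 460_916, -2_286_256),
    ('WikiFixer MLE, GM-RKB', 9_838, 449, 7_593),
    ('WikiFixer NNet Wikipedia, Wikipedia', 6_385, 211, 5_330),
]
for name, tp, fp, score in rows:
    assert tp - 5 * fp == score, name
print('All rows are consistent with Score = TP - 5*FP.')
```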
2019
- (Kantor et al., 2019) ⇒ Yoav Kantor, Yoav Katz, Leshem Choshen, Edo Cohen-Karlik, Naftali Liberman, Assaf Toledo, Amir Menczel, and Noam Slonim (2019). "Learning to Combine Grammatical Error Corrections". In: Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications. ACL 2019. DOI: 10.18653/v1/W19-4414.
- QUOTE: We tested Enchant, JamSpell and Norvig spellcheckers, finding our spellchecker outperforms those in terms of spelling correction ...
| All Categories | P | R | F0.5 |
| --- | --- | --- | --- |
| Norvig | 0.5217 | 0.0355 | 0.1396 |
| Enchant | 0.2269 | 0.0411 | 0.1192 |
| JamSpell | 0.4385 | 0.0449 | 0.1593 |
| our | 0.5116 | 0.0295 | 0.1198 |
| R:SPELL | P | R | F0.5 |
| --- | --- | --- | --- |
| Norvig | 0.5775 | 0.6357 | 0.5882 |
| Enchant | 0.316 | 0.6899 | 0.3544 |
| JamSpell | 0.5336 | 0.6977 | 0.5599 |
| our | 0.6721 | 0.5297 | 0.6378 |
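The F0.5 values in both tables are consistent with the standard F-beta measure at beta = 0.5, which weights precision twice as heavily as recall. A quick recomputation of the JamSpell rows:

```python
def f_beta(p: float, r: float, beta: float = 0.5) -> float:
    """Standard F-beta score: beta < 1 weights precision more than recall."""
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)

# JamSpell rows from the two tables above.
print(round(f_beta(0.4385, 0.0449), 4))  # 0.1593 (All Categories)
print(round(f_beta(0.5336, 0.6977), 4))  # 0.5599 (R:SPELL)
```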