JamSpell


A JamSpell is a Context-Sensitive Spelling Error Correction System developed by Filipp Ozinov.



References

2020a

...
Corrector  Errors  Top 7 Errors  Fix Rate  Top 7 Fix Rate  Broken  Speed (words/second)
JamSpell   3.25%   1.27%         79.53%    84.10%          0.64%   4854
Norvig     7.62%   5.00%         46.58%    66.51%          0.69%   395
Hunspell   13.10%  10.33%        47.52%    68.56%          7.14%   163
Dummy      13.14%  13.14%        0.00%     0.00%           0.00%   -
The model was trained on 300K Wikipedia sentences plus 300K news sentences (English); 95% of the data was used for training and 5% for evaluation. An error model was used to generate errored text from the original text. The JamSpell corrector was compared with Norvig's corrector, Hunspell, and a dummy corrector (no corrections).
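
The columns of the table above appear to be related arithmetically, and a short check makes that relationship concrete. The sketch below assumes (the excerpt does not define the columns) that "Fix Rate" is the share of originally erroneous words that were corrected, "Broken" is the share of originally correct words that were made wrong, and the "Dummy" row gives the error rate before any correction. The function name `expected_errors` is illustrative only.

```python
# Values copied from the benchmark table (percentages).
BASELINE_ERRORS = 13.14  # "Dummy" row: error rate when nothing is corrected

ROWS = {
    # corrector: (reported Errors, Fix Rate, Broken)
    "JamSpell": (3.25, 79.53, 0.64),
    "Norvig": (7.62, 46.58, 0.69),
    "Hunspell": (13.10, 47.52, 7.14),
}


def expected_errors(baseline, fix_rate, broken):
    """Estimate the residual error rate (percent) under the assumed column meanings."""
    unfixed = baseline * (1 - fix_rate / 100)         # errors left uncorrected
    newly_broken = (100 - baseline) * broken / 100    # correct words made wrong
    return unfixed + newly_broken


for name, (reported, fix_rate, broken) in ROWS.items():
    estimate = expected_errors(BASELINE_ERRORS, fix_rate, broken)
    print(f"{name}: reported {reported:.2f}%, estimated {estimate:.2f}%")
```

Under these assumptions the estimates agree with the reported "Errors" column to within rounding, which suggests that Hunspell's high "Broken" rate offsets most of what it fixes.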

2020b

Model                                          TP      FP         Score
JamSpell                                       18,324  460,916    -2,286,256
Pyenchant                                      18,630  4,717,170  -23,567,220
WikiFixer MLE                                  9,838   449        7,593
WikiFixer NNet GM-RKB                          16,061  696        12,581
WikiFixer NNet Wikipedia                       8,678   524        6,058
WikiFixer NNet Wikipedia pretrained + GM-RKB   13,841  490        11,391
WikiFixer NNet Wikipedia 7,000 pages + GM-RKB  16,003  652        12,743
Table 1: GM-RKB Testing Dataset Results.

Model                                          TP      FP         Score
JamSpell                                       11,479  312,809    -1,552,566
Pyenchant                                      9,656   8,351,825  -41,749,469
WikiFixer MLE                                  252     166        -578
WikiFixer NNet GM-RKB                          3,954   287        2,519
WikiFixer NNet Wikipedia                       6,385   211        5,330
WikiFixer NNet Wikipedia pretrained + GM-RKB   3,284   160        2,484
WikiFixer NNet Wikipedia 7,000 pages + GM-RKB  6,056   277        4,671
Table 2: Wikipedia Testing Dataset Results.
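
The excerpt does not say how the Score column is computed, but the rows of Tables 1 and 2 are consistent with Score = TP - 5 x FP, that is, each false positive costs five true positives. This is an inference from the numbers, not a definition taken from the source; the sketch below checks it on a few rows, and `FP_PENALTY` and `score` are illustrative names.

```python
FP_PENALTY = 5  # inferred from the table rows; not stated in the excerpt

ROWS = [
    # (table, model, TP, FP, reported Score)
    ("Table 1", "JamSpell", 18_324, 460_916, -2_286_256),
    ("Table 1", "Pyenchant", 18_630, 4_717_170, -23_567_220),
    ("Table 1", "WikiFixer MLE", 9_838, 449, 7_593),
    ("Table 2", "JamSpell", 11_479, 312_809, -1_552_566),
    ("Table 2", "Pyenchant", 9_656, 8_351_825, -41_749_469),
    ("Table 2", "WikiFixer MLE", 252, 166, -578),
]


def score(tp, fp, penalty=FP_PENALTY):
    """True positives minus a fixed penalty for every false positive."""
    return tp - penalty * fp


for table, model, tp, fp, reported in ROWS:
    status = "matches" if score(tp, fp) == reported else "differs"
    print(f"{table} {model}: computed {score(tp, fp):,} ({status})")
```

Read this way, the large negative scores for JamSpell and Pyenchant come almost entirely from their false-positive counts; their true-positive counts are comparable to or higher than those of the WikiFixer models.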

2019

All Categories  P       R       F0.5
Norvig          0.5217  0.0355  0.1396
Enchant         0.2269  0.0411  0.1192
JamSpell        0.4385  0.0449  0.1593
our             0.5116  0.0295  0.1198
Table 2: Comparison of Grammatical Error Performance of Spellcheckers. JamSpell achieves the best score as previously suggested.

R:SPELL   P       R       F0.5
Norvig    0.5775  0.6357  0.5882
Enchant   0.316   0.6899  0.3544
JamSpell  0.5336  0.6977  0.5599
our       0.6721  0.5297  0.6378
Table 3: Comparison of spellcheckers on spelling. Our method outperforms other methods.
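
The F0.5 column in the two 2019 tables can be reproduced from the P and R columns, assuming P and R denote precision and recall and F0.5 is the standard F-beta measure with beta = 0.5, which weights precision more heavily than recall. The following is a minimal sketch of that calculation; the helper name `f_beta` is illustrative.

```python
def f_beta(precision, recall, beta=0.5):
    """Standard F-beta score; beta < 1 weights precision more than recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


ROWS = [
    # (category, system, P, R, reported F0.5)
    ("All Categories", "Norvig", 0.5217, 0.0355, 0.1396),
    ("All Categories", "Enchant", 0.2269, 0.0411, 0.1192),
    ("All Categories", "JamSpell", 0.4385, 0.0449, 0.1593),
    ("All Categories", "our", 0.5116, 0.0295, 0.1198),
    ("R:SPELL", "Norvig", 0.5775, 0.6357, 0.5882),
    ("R:SPELL", "Enchant", 0.316, 0.6899, 0.3544),
    ("R:SPELL", "JamSpell", 0.5336, 0.6977, 0.5599),
    ("R:SPELL", "our", 0.6721, 0.5297, 0.6378),
]

for category, system, p, r, reported in ROWS:
    print(f"{category} / {system}: reported {reported:.4f}, computed {f_beta(p, r):.4f}")
```

Small differences in the last digit are expected because the published P and R values are themselves rounded. The precision weighting explains why JamSpell, which has the highest recall on R:SPELL, still scores below the system with the highest precision.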