Spelling Error Correction (SEC) System
A Spelling Error Correction (SEC) System is a text error correction system that implements a spell checking algorithm to solve a spell checking task.
- AKA: Spell Checker, Spell Checking System.
- Context:
- It can range from being a Heuristic Spell Checking System to being a Data-Driven Spell Checking System (such as a supervised spell checking system).
- It can range from being a Context-Sensitive Spelling Correction Error System to being a Domain-Specific Spelling Correction Error System.
- It can range from being a Word-by-Word Spelling Error Correction System, to being a Sentence-by-Sentence Spelling Error Correction System.
- It can range from being an Offline Spelling Error Correction System to being an Online Spelling Error Correction System.
- It can be part of an Orthographical Error Correction System.
- It can be supported by a Typographical Error Correction System.
- Example(s):
- a Cluster-based Spelling Correction Error System such as:
- a Context-Sensitive Spelling Correction Error System such as:
- a Dictionary-based Spelling Correction Error System,
- a Domain-Specific Spelling Correction Error System such as:
- a Deep Learning Spelling Error Correction System such as:
- a Rule-based Spelling Error Correction System,
- a Word Processor Spell Checker such as:
- an Email Client Spell Checker,
- Gorin SPELL Program,
- GNU Aspell,
- Hunspell,
- PyEnchant.
- …
- Counter-Example(s):
- See: Search Engine, Application Software, Word Processor, Email Client.
References
2020
- (Thunderbird Addons, 2020) ⇒ https://addons.thunderbird.net/en-US/thunderbird/addon/openmedspel-medical-spellin/ Retrieved:2010-01-19.
- QUOTE: OpenMedSpel is a medical spelling plugin that adds tens of thousands of commonly used USA English medical terms to your standard dictionary, ranging from abdominis to zygomatic, all of which aggravate traditional spellcheck programs and dictionaries.
2019a
- (Gupta, 2019) ⇒ Prabhakar Gupta (2019). "A Context Sensitive Real-Time Spell Checker With Language Adaptability". In: Preprint arXiv:1910.11242.
- QUOTE: We present a context-sensitive real-time spell-checker system which can be adapted to any language. One of the biggest problem earlier was absence of data for languages other than English, so we propose three approaches to create noisy channel datasets of real-world typographic errors. We use Wikipedia data for creating dictionaries and synthesizing test data. To compensate for resource-scarcity of most languages we also use manually curated movie subtitles since it provides information about how people communicate ...
2019b
- (Gupta et al., 2019) ⇒ Jai Gupta, Zhen Qin, Michael Bendersky, and Donald Metzler. (2019, May). "Personalized Online Spell Correction for Personal Search". In: The World Wide Web Conference. ACM.
- QUOTE: In this work, we propose a simple and effective personalized spell correction solution that augments existing global solutions for search over private corpora. Our event driven spell correction candidate generation method is specifically designed with personalization as the key construct. Our novel spell correction and query completion algorithms do not require complex model training and is highly efficient. The proposed solution has shown over 30% clickthrough rate gain on affected queries when evaluated against a range of strong commercial personal search baselines - Google’s Gmail, Drive, and Calendar search production systems.
2018a
- (Etoori et al., 2018) ⇒ Pravallika Etoori, Manoj Chinnakotla, and Radhika Mamidi. (2018). “Automatic Spelling Correction for Resource-Scarce Languages Using Deep Learning.” In: Proceedings of ACL 2018, Student Research Workshop.
- QUOTE: In this paper, we approach the spelling correction problem for Indic languages with Deep learning. This model can be employed for any resource-scarce language. We propose a character based Sequence-to-sequence text Correction Model for Indic Languages (SCMIL) which trains end-to-end.
2018b
- (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/spell_checker Retrieved:2018-3-23.
- QUOTE: In computing, a spell checker (or spell check) is an application program that flags words in a document that may not be spelled correctly. Spell checkers may be stand-alone, capable of operating on a block of text, or as part of a larger application, such as a word processor, email client, electronic dictionary, or search engine.
2017
- (Xue, 2017) ⇒ Harry Xue. (2017). “Context-sensitive Spell Correction with Deep Learning.” In: Makers (Under Armour Inc.) Blog post.
- QUOTE: How does seq2seq work? On a high level it mimics the way we as humans process language. Seq2seq reads in a sentence, gets a sense of the entire sentence’s “meaning” or context and then performs its assigned task. (...)
We used this same intuition in developing our context-sensitive spell correction system, which we aptly named seq2spell. Here, instead of translating from one language to another, we “translate” from possibly misspelled lines in English to their corrected versions. Seq2spell reads in a possibly misspelled line, encodes it into a representation of its “meaning” and then outputs the corrected line:
2016
- (Chan, 2016) ⇒ Tsz Ching Sam Chan (2016). "Third Year Project: A Context-Sensitive Spell Checker Using Trigrams And Confusion Sets". Development, 5, 1-2.
- QUOTE: In a context-sensitive spell checker, more information in the overall context are reviewed. It is often a sentence-by-sentence checking rather than a word-by-word checking in the checking stage. When generating suggestions, it will analyse the surrounding context and gives the most suitable words depending on the context instead of the lexicon. Suggestions are often generated from confusion sets, sets of words that are easy to confuse with one another. For example,
{peace, piece}
is a confusion set as they often confuse to each other. Different types of spell checker has different approach in context extraction and it affects the checking process and confusion set generation.
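The confusion-set idea in the quote above can be illustrated with a minimal sketch. It is not Chan's implementation: the toy corpus, the second confusion set, the raw trigram-count scoring, and the names `train_trigrams`, `score`, and `correct_sentence` are all assumptions made for illustration. Each word that belongs to a confusion set is replaced by the member whose surrounding trigrams were seen most often in the training sentences, and the original word is kept on ties.

```python
from collections import Counter

# Hypothetical confusion sets; {peace, piece} is the example from the quote,
# the second set is an added assumption.
CONFUSION_SETS = [{"peace", "piece"}, {"their", "there"}]


def train_trigrams(sentences):
    """Count word trigrams over whitespace-tokenized, lowercased sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(len(tokens) - 2):
            counts[tuple(tokens[i:i + 3])] += 1
    return counts


def score(tokens, i, candidate, counts):
    """Sum the counts of every trigram that covers position i
    once `candidate` is substituted there."""
    padded = ["<s>"] + tokens[:i] + [candidate] + tokens[i + 1:] + ["</s>"]
    j = i + 1  # index of the candidate inside `padded`
    total = 0
    for start in range(j - 2, j + 1):
        if start >= 0 and start + 3 <= len(padded):
            total += counts[tuple(padded[start:start + 3])]
    return total


def correct_sentence(sentence, counts):
    """Sentence-by-sentence check: re-rank each confusable word by context."""
    tokens = sentence.lower().split()
    for i, word in enumerate(tokens):
        for confusion_set in CONFUSION_SETS:
            if word in confusion_set:
                # Highest trigram score wins; ties keep the original word.
                tokens[i] = max(
                    sorted(confusion_set),
                    key=lambda c: (score(tokens, i, c, counts), c == word),
                )
    return " ".join(tokens)


# Toy training corpus (an assumption; a real system would use a large corpus).
corpus = [
    "i want a piece of cake",
    "we hope for peace on earth",
]
counts = train_trigrams(corpus)
print(correct_sentence("i want a peace of cake", counts))
```

A practical system would smooth the counts into probabilities rather than summing raw frequencies, since unseen contexts otherwise all score zero.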
2014
- (Zampieri & De Amorim, 2014) ⇒ Marcos Zampieri, and Renato Cordeiro de Amorim (2014). "Between Sound and Spelling: Combining Phonetics and Clustering Algorithms to Improve Target Word Recovery". In: Przepiórkowski A., Ogrodniczuk M. (eds). Advances in Natural Language Processing. NLP 2014. Lecture Notes in Computer Science, 8686. DOI:10.1007/978-3-319-10888-9_43
2013
- (De Amorim & Zampieri, 2013) ⇒ Renato Cordeiro de Amorim, Marcos Zampieri (2013). "Effective Spell Checking Methods Using Clustering Algorithms". In: Proceedings of the International Conference Recent Advances in Natural Language Processing (RANLP 2013).
2011
- (Earnest, 2011) ⇒ Les Earnest (2011). "The First Three Spelling Checkers". In:
- QUOTE: SPELL is a program designed to read text files and check them for correctness of spelling. In addition to the spelling check, the program provides a means for correcting words that it thinks are misspelled. This program was written by Ralph E. Gorin of Stanford University Artificial Intelligence Laboratory. It has been augmented by William Plummer and Jerry Wolf of BBN.
In its normal mode of usage, SPELL reads through an input text file, asks the user about each word it does not recognize, and creates an output file in which corrections have been made.
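The workflow described in the quote above (scan the text, ask about each unrecognized word, emit a corrected copy) can be sketched roughly as follows. This is not Gorin's program: the function name `check_text`, the `ask` callback standing in for the interactive prompt, the regular-expression tokenizer, and the choice to remember an answer for repeated words are all assumptions. The sketch also works on strings rather than input and output files.

```python
import re

# Assumed tokenizer: runs of ASCII letters and apostrophes count as words.
WORD_RE = re.compile(r"[A-Za-z']+")


def check_text(text, known_words, ask):
    """Return a corrected copy of `text`.

    known_words: set of lowercase words treated as correctly spelled.
    ask: callback taking an unrecognized word and returning a replacement,
         or None to accept the word as written (stands in for the user).
    """
    decisions = {}  # lowercase word -> replacement, or None if accepted

    def replace(match):
        word = match.group(0)
        key = word.lower()
        if key in known_words:
            return word
        if key not in decisions:
            # Ask only once per distinct word (a design assumption).
            decisions[key] = ask(word) or None
        return decisions[key] or word

    # Only word tokens are rewritten; spacing and punctuation pass through.
    return WORD_RE.sub(replace, text)


# Usage with a scripted "user" instead of an interactive prompt.
known = {"the", "cat", "sat"}
answers = {"teh": "the"}
asked = []


def scripted_user(word):
    asked.append(word)
    return answers.get(word.lower())


result = check_text("Teh cat sat on teh mat.", known, scripted_user)
print(result)
```

Passing the prompt in as a callback keeps the checking loop testable; an interactive version could supply a function that calls `input()` instead.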
2009
- (Chitu, 2009) ⇒ Alex Chitu (2009). "Google's Context-Sensitive Spell Checker". In: Google Blogspot.
- QUOTE: Google Wave, the service demoed yesterday at Google I/O, includes a context-sensitive spell checker that highlights errors as you type. Google uses the language models built for Google Translate to find words that don't belong in a certain context.
1999
- (Golding & Roth, 1999) ⇒ Andrew R. Golding, and Dan Roth (1999). "A Winnow-Based Approach to Context-Sensitive Spelling Correction". In: Machine learning, 34(1-3), 107-130. DOI:10.1023/A:1007545901558
1996
- (Golding & Schabes, 1996) ⇒ Andrew Golding, and Yves Schabes (1996). "Combining Trigram-Based and Feature-Based Methods for Context-Sensitive Spelling Correction". In: Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics. DOI:10.3115/981863.981873.
1995
- (Golding, 1995) ⇒ Andrew Golding (1995). "A Bayesian Hybrid Method for Context-sensitive Spelling Correction". In: Proceedings of Third Workshop on Very Large Corpora.