Approximate String Matching Task
Jump to navigation
Jump to search
An approximate string matching task is a string matching task that is a similarity search task.
- Context:
- Input:
- a string space (with a string set and a string distance function)
- a similarity search type.
- optional: a query point [math]\displaystyle{ q \in P }[/math].
- optional: a distance value [math]\displaystyle{ r \in \mathbb{R} }[/math] (such as a radius)
- optional: a positive integer [math]\displaystyle{ k }[/math].
- It can be solved by an Approximate String Matching System that implements an (approximate string matching algorithm).
- It can be of type:
- a Range-type String Matching Task (a Range Search Task).
- a Nearest Neighbor-type String Matching Task (a Nearest Neighbor Search Task).
- a Reverse Nearest Neighbor-type String Matching Task (a Reverse Nearest Neighbor Search Task).
- a Similarity Join-type String Matching Task (a Similarity Join Search Task).
- Input:
- Example(s):
- Counter-Example(s):
- See: String Distance, String Mismatch.
References
2011
- (Wikipedia, 2011) ⇒ http://en.wikipedia.org/wiki/Approximate_string_matching
- In computing, approximate string matching (often colloquially referred to as fuzzy string searching) is the technique of finding strings that match a pattern approximately (rather than exactly). The problem of approximate string matching is typically divided into two sub-problems: finding approximate substring matches inside a given string and finding dictionary strings that match the pattern approximately.
2002
- (Navarro & Raffinot, 2002) ⇒ Gonzalo Navarro, and Mathieu Raffinot. (2002). “Flexible Pattern Matching in Strings.” Cambridge University Press.
- QUOTE: Approximate string matching, also called “string matching allowing errors," is the problem of finding a pattern [math]\displaystyle{ p }[/math] in a text [math]\displaystyle{ T }[/math] when a limited number [math]\displaystyle{ k }[/math] of differences is permitted between the pattern and its occurrences in the text. We call α=k/m the "error level.” It gives a measure of the "fraction" of the pattern that can be altered.
2001
- (Navarro, 2001) ⇒ Gonzalo Navarro. (2001). “A Guided Tour to Approximate String Matching.” In: ACM Computing Surveys (CSUR), 33(1). doi:10.1145/375360.375365
- QUOTE: This work focuses on the problem of string matching that allows errors, also called approximate string matching. The general goal is to perform string matching of a pattern in a text where one or both of them have suffered some kind of (undesirable) corruption. Some examples are recovering the original signals after their transmission over noisy channels, finding DNA subsequences after possible mutations, and text searching where there are typing or spelling errors.
1995
- (Zobel & Dart, 1995) ⇒ J. Zobel, P. Dart. (1995). “Finding Approximate Matches in Large Lexicons.” In: Software-Practice & Experience 25(3), pp 331–345.
1985
- (Ukkonen, 1985) ⇒ E. Ukkonen. (1985). “Algorithms for Approximate String Matching.” In: Information and Control, 64.
1974
- (Wagner & Fischer, 1974) ⇒ Robert A. Wagner, and Michael J. Fischer. (1974). “The String to String Correction Problem.” In: Journal of the ACM (JACM), 21(1).