Similarity Join Query

References

(Augsten & Böhlen, 2013) ⇒ Nikolaus Augsten and Michael H. Böhlen. (2013). “Similarity Joins in Relational Database Systems.” In: Synthesis Lectures on Data Management Journal, 5(5). doi:10.2200/S00544ED1V01Y201310DTM038
- QUOTE: Token-based distances are used to compute an approximation of the edit distance and prune expensive edit distance calculations. A key observation when computing similarity joins is that many of the object pairs, for which the similarity is computed, are very different from each other. Filters exploit this property to improve the performance of similarity joins. A filter preprocesses the input data sets and produces a set of candidate pairs.
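
The filter-then-verify idea in the quote can be illustrated with a minimal Python sketch. The quote does not name a specific filter, so the choices here are assumptions made for illustration: tokens are unpadded q-grams, the filter combines a length check with a q-gram count bound, and the function names (`qgrams`, `passes_filter`, `similarity_join`) and the parameters `k` (edit distance threshold) and `q` (token length) are hypothetical. The count bound relies on the fact that one edit operation changes at most `q` of a string's q-grams.

```python
from collections import Counter
from itertools import product


def qgrams(s, q=2):
    """Bag (multiset) of unpadded q-grams of s; empty if len(s) < q."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def edit_distance(s, t):
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (cs != ct)))  # substitution or match
        prev = curr
    return prev[-1]


def passes_filter(s, t, k, q=2):
    """Cheap token-based test; never rejects a pair with edit distance <= k."""
    # Length filter: k edits change the length by at most k.
    if abs(len(s) - len(t)) > k:
        return False
    # Count filter: each edit destroys at most q q-grams of the longer string,
    # so at least this many q-grams must be shared (as a bag intersection).
    needed = max(len(s), len(t)) - q + 1 - k * q
    if needed <= 0:
        return True  # the bound is vacuous for short strings
    overlap = sum((qgrams(s, q) & qgrams(t, q)).values())
    return overlap >= needed


def similarity_join(left, right, k, q=2):
    """Return (matching pairs, number of candidate pairs that were verified)."""
    # Filter step: produce the candidate pairs.
    candidates = [(s, t) for s, t in product(left, right)
                  if passes_filter(s, t, k, q)]
    # Verification step: run the expensive edit distance on candidates only.
    matches = [(s, t) for s, t in candidates if edit_distance(s, t) <= k]
    return matches, len(candidates)
```

For example, calling `similarity_join(["similarity", "join", "filter"], ["similarly", "joint", "database"], k=2)` computes the edit distance for only those of the nine pairs that survive the filter. This sketch still enumerates every pair in order to apply the filter; practical systems typically use an index to avoid that enumeration, which is outside the scope of this illustration.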