Nearest Neighbor Search (NNS) Task
Jump to navigation
Jump to search
A Nearest Neighbor Search (NNS) Task is a similarity search task that requires finding item records in a metric space that are near a query point.
- Context:
- Input:
- a Metric Space [math]\displaystyle{ M }[/math] (with point space [math]\displaystyle{ P }[/math] and distance function [math]\displaystyle{ d }[/math]).
- a Query Point [math]\displaystyle{ q \in P }[/math]
- a Positive Integer [math]\displaystyle{ k }[/math].
- output: the [math]\displaystyle{ k }[/math] points in [math]\displaystyle{ P }[/math] closest to q.
- It can range from being a 1-Nearest Neighbor Search Task to being a k-Nearest Neighbor Search Task.
- It can range from being an Exact Nearest Neighbor Search Task to being an Approximate Nearest Neighbor Search Task.
- It can be solved by a Nearest Neighbor Search System (that implements a nearest neighbor algorithm).
- …
- Input:
- Example(s):
- “Where is the nearest post office?” (post-office problem).
- “What are the four restaurants closest to my home?".
- “What movies are most similar to movie X?".
- “Solve the supervised classification C using a nearest neighbor algorithm, include your source code?".
- “Identify the most similar historical stock patterns to today's market for predicting future movements.".
- “Find the most similar genetic sequences to a given DNA strand in a genetic database.".
- “Determine the closest match to a customer's preference profile in a user database for personalized marketing.".
- …
- Counter-Example(s):
- ...
- See: Reverse Nearest Neighbor Search Task, Sampling-Based Motion Planning Algorithms, Pattern Recognition, Optical Character Recognition, Statistical Classification, k-Nearest Neighbor Algorithm, Computer Vision, Computational Geometry, Closest Pair of Points Problem, Database, Content-Based Image Retrieval, Coding Theory, Decoding Methods.
References
2020
- (Wikipedia, 2020) ⇒ https://en.wikipedia.org/wiki/Nearest_neighbor_search Retrieved:2020-5-5.
- Nearest neighbor search (NNS), as a form of proximity search, is the optimization problem of finding the point in a given set that is closest (or most similar) to a given point. Closeness is typically expressed in terms of a dissimilarity function: the less similar the objects, the larger the function values. Formally, the nearest-neighbor (NN) search problem is defined as follows: given a set S of points in a space M and a query point q ∈ M, find the closest point in S to q. Donald Knuth in vol. 3 of The Art of Computer Programming (1973) called it the post-office problem, referring to an application of assigning to a residence the nearest post office. A direct generalization of this problem is a k-NN search, where we need to find the k closest points.
Most commonly M is a metric space and dissimilarity is expressed as a distance metric, which is symmetric and satisfies the triangle inequality. Even more common, M is taken to be the d-dimensional vector space where dissimilarity is measured using the Euclidean distance, Manhattan distance or other distance metric. However, the dissimilarity function can be arbitrary. One example is asymmetric Bregman divergence, for which the triangle inequality does not hold.
- Nearest neighbor search (NNS), as a form of proximity search, is the optimization problem of finding the point in a given set that is closest (or most similar) to a given point. Closeness is typically expressed in terms of a dissimilarity function: the less similar the objects, the larger the function values. Formally, the nearest-neighbor (NN) search problem is defined as follows: given a set S of points in a space M and a query point q ∈ M, find the closest point in S to q. Donald Knuth in vol. 3 of The Art of Computer Programming (1973) called it the post-office problem, referring to an application of assigning to a residence the nearest post office. A direct generalization of this problem is a k-NN search, where we need to find the k closest points.
2020
- (Wikipedia, 2020) ⇒ https://en.wikipedia.org/wiki/Nearest_neighbor_search#Applications Retrieved:2020-5-5.
- The nearest neighbour search problem arises in numerous fields of application, including:
- Pattern recognition – in particular for optical character recognition.
- Statistical classification – see k-nearest neighbor algorithm.
- Computer vision.
- Computational geometry – see Closest pair of points problem.
- Databases – e.g. content-based image retrieval.
- Coding theory – see maximum likelihood decoding.
- Data compression – see MPEG-2 standard
- Robotic sensing
- Recommendation systems, e.g. see Collaborative filtering.
- Internet marketing – see contextual advertising and behavioral targeting.
- DNA sequencing.
- Spell checking – suggesting correct spelling
- Plagiarism detection.
- Similarity scores for predicting career paths of professional athletes.
- Cluster analysis – assignment of a set of observations into subsets (called clusters) so that observations in the same cluster are similar in some sense, usually based on Euclidean distance.
- Chemical similarity.
- Sampling-based motion planning
- The nearest neighbour search problem arises in numerous fields of application, including:
2007
- (Zezula et al., 2007) ⇒ Pavel Zezula, Giuseppe Amato, and Vlastislav Dohnal. (2007). “Similarity Search - The Metric Space Approach." Tutorial at ACM SAC 2007.
2006
- (Zezula et al., 2006) ⇒ Pavel Zezula, Giuseppe Amato, Vlastislav Dohnal, and Michal Batko. (2006). “Similarity Search: The Metric Space Approach." Springer, Advances in Database Systems.