Distance Metric Learning Task


A Distance Metric Learning Task is a distance function creation task that is also a function training task (its goal is to train a distance function from data).



References

2015

  • (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/similarity_learning#Metric_learning Retrieved:2015-5-11.
    • Similarity learning is closely related to distance metric learning. Metric learning is the task of learning a distance function over objects. A metric or distance function has to obey four axioms: non-negativity, Identity of indiscernibles, symmetry and subadditivity / triangle inequality. In practice, metric learning algorithms ignore the condition of identity of indiscernibles and learn a pseudo-metric.

      When the objects [math]\displaystyle{ x_i }[/math] are vectors in [math]\displaystyle{ R^d }[/math], any matrix [math]\displaystyle{ W }[/math] in the symmetric positive semi-definite cone [math]\displaystyle{ S_+^d }[/math] defines a distance pseudo-metric on the space of x through the form [math]\displaystyle{ D_W(x_1, x_2)^2 = (x_1-x_2)^{\top} W (x_1-x_2) }[/math]. When [math]\displaystyle{ W }[/math] is a symmetric positive definite matrix, [math]\displaystyle{ D_W }[/math] is a metric. Moreover, since any symmetric positive semi-definite matrix [math]\displaystyle{ W \in S_+^d }[/math] can be decomposed as [math]\displaystyle{ W = L^{\top}L }[/math] where [math]\displaystyle{ L \in R^{e \times d} }[/math] and [math]\displaystyle{ e \geq \operatorname{rank}(W) }[/math], the distance function [math]\displaystyle{ D_W }[/math] can be rewritten equivalently as [math]\displaystyle{ D_W(x_1, x_2)^2 = (x_1-x_2)^{\top} L^{\top}L (x_1-x_2) = \| L (x_1-x_2) \|_2^2 }[/math]. The distance [math]\displaystyle{ D_W(x_1, x_2)^2=\| x_1' - x_2' \|_2^2 }[/math] thus corresponds to the squared Euclidean distance between the projected feature vectors [math]\displaystyle{ x_1'= Lx_1 }[/math] and [math]\displaystyle{ x_2'= Lx_2 }[/math].

      Some well-known approaches for metric learning include large margin nearest neighbor (LMNN) and information-theoretic metric learning (ITML).

      In statistics, the covariance matrix of the data is sometimes used to define a distance metric called Mahalanobis distance.
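
The quadratic form and the factorization quoted above can be checked numerically. Below is a minimal sketch in Python/NumPy; the synthetic data, the variable names, and the use of a Cholesky factor for [math]\displaystyle{ L }[/math] are illustrative assumptions, not part of the excerpt. The last lines also show the Mahalanobis special case, where [math]\displaystyle{ W }[/math] is taken to be the inverse covariance matrix of the data.

```python
# Minimal numerical sketch of the formulas quoted above (synthetic data).
import numpy as np

rng = np.random.default_rng(0)
d = 4
x1, x2 = rng.normal(size=d), rng.normal(size=d)

# Any symmetric positive semi-definite W defines D_W(x1, x2)^2 = (x1 - x2)^T W (x1 - x2).
A = rng.normal(size=(d, d))
W = A.T @ A                      # symmetric positive (semi-)definite by construction
diff = x1 - x2
d2_W = diff @ W @ diff

# Decompose W = L^T L (here via a Cholesky factor, valid when W is positive definite)
# and check the equivalent form D_W(x1, x2)^2 = ||L (x1 - x2)||_2^2.
L = np.linalg.cholesky(W).T      # W = L^T L
d2_L = np.sum((L @ diff) ** 2)
assert np.isclose(d2_W, d2_L)

# Mahalanobis special case: W is the inverse covariance matrix of the data.
X = rng.normal(size=(200, d))
W_mahal = np.linalg.inv(np.cov(X, rowvar=False))
d_mahal = np.sqrt(diff @ W_mahal @ diff)
print(d2_W, d2_L, d_mahal)
```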


2008

  • (Xiang et al., 2008) ⇒ Shiming Xiang, Feiping Nie, and Changshui Zhang. (2008). “Learning a Mahalanobis Distance Metric for Data Clustering and Classification.” In: Pattern Recognition 41.
    • There has been considerable research on distance metric learning over the past few years [14]. One family of algorithms is developed with known class labels of the training data points. Algorithms in this family include neighbourhood components analysis [15], large margin nearest neighbor classification [16], large margin component analysis [17], class collapse [18], and other extensions [19,20]. Their success on a variety of problems shows that the learned distance metric yields substantial improvements over the commonly used Euclidean distance metric [15–18]. However, class labels are strong information to require from users and cannot be easily obtained in some real-world situations. In contrast, it is more natural to specify which pairs of data points are similar or dissimilar. Such pairwise constraints appear frequently in many applications. For example, in image retrieval the images that are similar and dissimilar to the query are labeled by the user, and such image pairs can be used to learn a distance metric [21]. Accordingly, another family of distance metric learning algorithms is developed to make use of such pairwise constraints [14,21–29]. A pairwise constraint is a kind of side information [22]. One popular form of side information is must-links and cannot-links [22,30–35]. A must-link indicates that a pair of data points must be in the same class, while a cannot-link indicates that the two data points must be in two different classes. Another popular form is the relative comparison "A is closer to B than A is to C" [26]. The utility of pairwise constraints has been demonstrated in many applications, showing that significant improvement of the algorithm can be achieved [21–27].
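
As a loose illustration of the pairwise-constraint setting described above, the sketch below fits a diagonal Mahalanobis metric so that must-linked pairs are pulled together while cannot-linked pairs are pushed past a margin. This is not the method of Xiang et al.; the diagonal parameterization, the hinge-style loss, the learning rate, and the synthetic data are all assumptions made for brevity.

```python
# Illustrative sketch only (not the algorithm of Xiang et al., 2008):
# learn a diagonal Mahalanobis metric from must-link / cannot-link pairs.
import numpy as np

rng = np.random.default_rng(1)
d = 2
# Two synthetic clusters; must-links pair points within the first cluster,
# cannot-links pair points across the two clusters.
X = np.vstack([rng.normal(0.0, 1.0, (20, d)), rng.normal(4.0, 1.0, (20, d))])
must = [(i, j) for i in range(20) for j in range(i + 1, 20)][:50]
cannot = [(i, j) for i in range(20) for j in range(20, 40)][:50]

w = np.ones(d)  # diagonal of W; kept non-negative so diag(w) stays PSD

def sqdist(w, i, j):
    diff = X[i] - X[j]
    return np.sum(w * diff * diff)   # (x_i - x_j)^T diag(w) (x_i - x_j)

lr, margin = 0.01, 1.0
for _ in range(200):
    grad = np.zeros(d)
    for i, j in must:                # pull must-linked pairs together
        grad += (X[i] - X[j]) ** 2
    for i, j in cannot:              # push cannot-linked pairs past the margin
        if sqdist(w, i, j) < margin:
            grad -= (X[i] - X[j]) ** 2
    w = np.maximum(w - lr * grad / (len(must) + len(cannot)), 0.0)

print("learned diagonal metric:", w)
print("mean must-link distance:", np.mean([sqdist(w, i, j) for i, j in must]))
print("mean cannot-link distance:", np.mean([sqdist(w, i, j) for i, j in cannot]))
```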
