Kernel Matrix


A Kernel Matrix is a symmetric and positive semidefinite matrix whose [math]\displaystyle{ (i,j) }[/math] entry is the value of a kernel function evaluated on the [math]\displaystyle{ i }[/math]-th and [math]\displaystyle{ j }[/math]-th data points, and which thereby encodes the pairwise similarities of all points in a dataset.
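As a minimal sketch of that definition (not drawn from any cited source), the snippet below fills in a kernel matrix entry by entry; the Gaussian (RBF) kernel, the function name `rbf_kernel`, and the parameter `gamma` are illustrative choices of our own.

```python
# A minimal sketch: building a kernel matrix for a toy dataset with an
# illustrative Gaussian (RBF) kernel. Names and parameters are assumptions.
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: k(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

# Toy data: n = 4 points in R^2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
n = len(X)

# K[i, j] = k(x_i, x_j); the result is symmetric with ones on the diagonal.
K = np.array([[rbf_kernel(X[i], X[j]) for j in range(n)] for i in range(n)])

assert np.allclose(K, K.T)  # symmetry holds by construction
print(np.round(K, 3))
```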



References

2018a

  • (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/Kernel_method#Mathematics:_the_kernel_trick Retrieved:2018-8-10.
    • QUOTE: Theoretically, a Gram matrix [math]\displaystyle{ \mathbf{K} \in \mathbb{R}^{n \times n} }[/math] with respect to [math]\displaystyle{ \{\mathbf{x}_1, \dotsc, \mathbf{x}_n\} }[/math] (sometimes also called a "kernel matrix"), where [math]\displaystyle{ K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j) }[/math], must be positive semi-definite (PSD). Empirically, for machine learning heuristics, choices of a function [math]\displaystyle{ k }[/math] that do not satisfy Mercer's condition may still perform reasonably if [math]\displaystyle{ k }[/math] at least approximates the intuitive idea of similarity. Regardless of whether [math]\displaystyle{ k }[/math] is a Mercer kernel, [math]\displaystyle{ k }[/math] may still be referred to as a "kernel". If the kernel function [math]\displaystyle{ k }[/math] is also a covariance function as used in Gaussian processes, then the Gram matrix [math]\displaystyle{ \mathbf{K} }[/math] can also be called a covariance matrix.
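To make the PSD requirement quoted above concrete, the hedged sketch below compares the Gram matrix of a valid Mercer kernel (the linear kernel) against a heuristic sigmoid-style similarity, which need not be PSD; the helper name `gram` and the kernel parameters are assumptions for this example only.

```python
# Sketch of the PSD requirement: a Mercer kernel's Gram matrix has no
# negative eigenvalues, while an arbitrary "similarity" function may
# violate this. Kernel choices here are purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

def gram(k, X):
    """Gram matrix G[i, j] = k(x_i, x_j) for all pairs of rows of X."""
    return np.array([[k(x, y) for y in X] for x in X])

# A valid Mercer kernel (linear kernel) ...
K_lin = gram(lambda x, y: x @ y, X)
# ... versus a sigmoid-style similarity that need not satisfy Mercer's condition.
K_sig = gram(lambda x, y: np.tanh(x @ y - 1.0), X)

for name, K in [("linear", K_lin), ("sigmoid", K_sig)]:
    eigvals = np.linalg.eigvalsh(K)  # eigenvalues of a symmetric matrix
    print(name, "min eigenvalue:", round(eigvals.min(), 6),
          "PSD:", bool(eigvals.min() >= -1e-10))
```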

2018b

  • (ML Wiki, 2018) ⇒ http://mlwiki.org/index.php/Gram_Matrices Retrieved:2018-8-10.
    • QUOTE: A Gram matrix of vectors [math]\displaystyle{ a_1, \cdots, a_n }[/math] is a matrix [math]\displaystyle{ G }[/math]
      • s.t. [math]\displaystyle{ G_{ij}=\langle a_i,a_j\rangle }[/math] for all [math]\displaystyle{ i,j }[/math]
      • if vectors [math]\displaystyle{ a_1,\cdots ,a_n }[/math] are the columns of a matrix [math]\displaystyle{ A }[/math], then [math]\displaystyle{ G=A^TA }[/math]
      • a Gram matrix is symmetric and positive semidefinite (positive definite if and only if the vectors are linearly independent).
      • if vectors [math]\displaystyle{ a_1, \cdots ,a_n }[/math] are the rows of [math]\displaystyle{ A }[/math] ([math]\displaystyle{ A }[/math] would be a so-called "data matrix"), then [math]\displaystyle{ G=AA^T }[/math], and it is called the left Gram matrix (see the sketch below).
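The following is a small numeric check of the quoted identities; the array shapes and variable names are our own choices, not the source's.

```python
# Numeric check of the quoted identities: with vectors a_1, ..., a_n as the
# columns of A, G = A^T A collects all pairwise inner products; with them
# as the rows, A A^T is the left Gram matrix.
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 3))           # 3 column vectors a_1, a_2, a_3 in R^4

G = A.T @ A                           # G[i, j] = <a_i, a_j>
assert np.allclose(G[0, 2], A[:, 0] @ A[:, 2])
assert np.allclose(G, G.T)            # symmetric

# Rows as data points ("data matrix" convention): the left Gram matrix.
G_left = A @ A.T                      # (i, j) entry = <row_i, row_j>
print(G.shape, G_left.shape)          # (3, 3) (4, 4)
```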

2007

  • (Nguyen & Ho, 2007) ⇒ Canh Hao Nguyen, and Tu Bao Ho (2007, January). "Kernel Matrix Evaluation". In IJCAI (pp. 987-992).
    • QUOTE: Training example set [math]\displaystyle{ \{x_i\}_{i=1,\cdots, n} \subset X }[/math] with the corresponding target vector [math]\displaystyle{ y=\{ y_i \}^T_{i=1, \cdots, n} \in \{-1, 1\}^n }[/math]. Suppose that [math]\displaystyle{ y_1 = \cdots = y_{n_+} = 1 }[/math] and [math]\displaystyle{ y_{n_++1} = \cdots = y_{n_++n_-} = -1;\; n_+ }[/math] examples belong to class [math]\displaystyle{ 1,\; n_- }[/math] examples belong to class [math]\displaystyle{ -1,\; n_+ + n_- = n }[/math]. Under a feature map [math]\displaystyle{ \phi }[/math], the kernel matrix is defined as:

      [math]\displaystyle{ K = \{k_{ij} = \langle \phi(x_i), \phi(x_j)\rangle\}_{i,j=1, \cdots, n} }[/math]
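The sketch below illustrates this definition with an explicit, hypothetical feature map [math]\displaystyle{ \phi }[/math] (a quadratic map of our own choosing, not one from the paper) and checks that this particular [math]\displaystyle{ \phi }[/math] realizes the homogeneous polynomial kernel [math]\displaystyle{ \langle x, z\rangle^2 }[/math].

```python
# Sketch of the quoted definition: given an explicit feature map phi,
# K has entries <phi(x_i), phi(x_j)>. The map phi is a hypothetical
# quadratic feature map, not one taken from the paper.
import numpy as np

def phi(x):
    """Quadratic feature map for x in R^2: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

# Toy binary-labeled sample mirroring the quote's layout:
# n_+ = 2 positive examples first, then n_- = 2 negative examples.
X = np.array([[1.0, 0.5], [0.8, 0.9], [-1.0, 0.2], [-0.7, -0.6]])
y = np.array([1, 1, -1, -1])          # shown only to mirror the class layout

Phi = np.array([phi(x) for x in X])
K = Phi @ Phi.T                       # K[i, j] = <phi(x_i), phi(x_j)>

# This phi realizes the homogeneous polynomial kernel k(x, z) = <x, z>^2.
assert np.allclose(K, (X @ X.T) ** 2)
print(np.round(K, 3))
```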

2004



  1. Theorem 7.2.10 Let [math]\displaystyle{ v_1,\ldots,v_m }[/math] be vectors in an inner product space with inner product [math]\displaystyle{ \langle{\cdot,\cdot}\rangle }[/math] and let [math]\displaystyle{ G = [\langle{v_j,v_i}\rangle]_{i,j=1}^m \in M_m }[/math]. Then
    (a) [math]\displaystyle{ G }[/math] is Hermitian and positive semidefinite
    (b) [math]\displaystyle{ G }[/math] is positive definite if and only if the vectors [math]\displaystyle{ v_1,\ldots,v_m }[/math] are linearly independent.
    (c) [math]\displaystyle{ \operatorname{rank}G=\dim\operatorname{span}\{v_1,\ldots,v_m\} }[/math]
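A numeric spot-check of parts (b) and (c) of the theorem, using a deliberately dependent vector; the data are illustrative and restricted to the real case, where the Gram matrix is symmetric.

```python
# Spot-check of Theorem 7.2.10 (b) and (c): rank(G) equals the dimension
# of span{v_1, ..., v_m}, and G fails to be positive definite exactly
# when the vectors are linearly dependent.
import numpy as np

v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = 2.0 * v1 - v2                    # dependent, so the span has dimension 2
V = np.column_stack([v1, v2, v3])     # vectors as columns

G = V.T @ V                           # real-case Gram matrix G[i, j] = <v_i, v_j>
print("rank(G)  =", np.linalg.matrix_rank(G))   # 2, matching part (c)
print("dim span =", np.linalg.matrix_rank(V))   # 2

# Part (b): with a dependent set, G has a zero eigenvalue, so it is
# positive semidefinite but not positive definite.
print("min eigenvalue:", round(np.linalg.eigvalsh(G).min(), 8))
```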