Linear Least-Squares Classification Algorithm
A Linear Least-Squares Classification Algorithm is a Learning Algorithm that uses a Linear Least-Squares Algorithm to solve a Text Classification Task.
- AKA: Linear Least-Squares Fit (LLSF) Algorithm.
- Example(s):
- …
- Counter-Example(s):
- See: Classification Algorithm, Logistic Regression Algorithm, Text Segmentation Algorithm, Text Classification Algorithm.
References
1992
- (Yang & Chute, 1992) ⇒ Yiming Yang, and Christopher G. Chute (1992). "An application of least squares fit mapping to clinical classification". In: Proceedings of the Annual Symposium on Computer Application in Medical Care (p. 460). American Medical Informatics Association. PMID: 1482917. DOI: 10.1145/160688.160738
- QUOTE: We use a large collection of human-assigned matches between texts and categories as a “training set”, to compute word-to-category connections. An LLSF technique enables us to weight these connections in such a way as to optimally fit the training set and to probabilistically capture the correct categories for arbitrary texts.
(...) Our focus is to obtain a mapping function which determines the correct or "nearly correct" category for an arbitrary text. In mathematics, there are well-established numerical methods for computing unknown functions using known data points. Applying this idea to the classification problem, we first need a numerical representation for texts and categories. Vectors and matrices have been used by information retrieval systems (...) where words are generally used as dimensions and a text is represented as a point in that multi-dimensional vector space.
(...) In our model, we use two matrices to represent a training set (...) Matrix A represents the texts, where rows correspond to words, columns are texts, and cells contain the numbers of times a word appears in the corresponding text. Matrix B indicates the matched categories of the texts in A using the corresponding columns. The rows of B are unique categories, the columns contain elements of 1 or 0, indicating whether a category is a match of the corresponding column (text) in A.
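The two training matrices described above can be sketched in NumPy. This is an illustrative reconstruction, not code from the paper: the vocabulary, texts, and category names below are invented for the example. Matrix A holds word counts (rows = words, columns = texts); matrix B holds 0/1 category indicators (rows = categories, columns = texts).

```python
import numpy as np

# Hypothetical toy training set: 3 texts over a 4-word vocabulary,
# each text matched to one of 2 categories.
vocab = ["fever", "cough", "fracture", "pain"]
texts = [
    ["fever", "cough", "cough"],  # text 1, matched to "infection"
    ["fracture", "pain"],         # text 2, matched to "injury"
    ["fever", "pain"],            # text 3, matched to "infection"
]
categories = ["infection", "injury"]
labels = ["infection", "injury", "infection"]

# Matrix A (n words x k texts): cell = count of the word in the text.
A = np.zeros((len(vocab), len(texts)))
for j, text in enumerate(texts):
    for word in text:
        A[vocab.index(word), j] += 1

# Matrix B (m categories x k texts): 1 if the category matches the text, else 0.
B = np.zeros((len(categories), len(texts)))
for j, cat in enumerate(labels):
    B[categories.index(cat), j] = 1
```

With this data, column 1 of A is [1, 2, 0, 0] (one "fever", two "cough"), and column 1 of B is [1, 0] (matched to "infection").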
(...) Having matrices A and B we can compute a mapping function which projects a point (vector) in the source space to a point in the target space. The function is a transformation matrix, called W, as defined below in the LLSF problem.
Definition 1. The LLSF problem is to find W which minimizes the sum
[math]\displaystyle{ \sum_{i=1}^k\parallel \vec{e_i}\parallel^2_2=\sum_{i=1}^k\parallel W\vec{a_i}-\vec{b_i}\parallel^2_2=\parallel WA-B\parallel^2_F }[/math]where [math]\displaystyle{ \vec{a_i} }[/math] is an [math]\displaystyle{ n \times 1 }[/math] text vector, [math]\displaystyle{ \vec{b_i} }[/math] is an [math]\displaystyle{ m \times 1 }[/math] category vector, [math]\displaystyle{ A_{n\times k}= [\vec{a_1},\vec{a_2},\cdots, \vec{a_k}] }[/math], [math]\displaystyle{ B_{m \times k} = [\vec{b_1},\vec{b_2},\cdots,\vec{b_k}] }[/math], [math]\displaystyle{ \vec{a_i} }[/math] and [math]\displaystyle{ \vec{b_i} }[/math] are a matched pair, [math]\displaystyle{ \vec{e}_i\overset{\mathrm{def}}{=} W\vec{a_i}- \vec{b_i} }[/math] is the mapping error of the i-th pair,
[math]\displaystyle{ \parallel\vec{v}\parallel_2\overset{\mathrm{def}}{=}\sqrt{\sum_{j=1}^m v_j^2} }[/math]is the vector 2-norm of an [math]\displaystyle{ m \times 1 }[/math] vector [math]\displaystyle{ \vec{v} }[/math], and
[math]\displaystyle{ \parallel M\parallel_F\overset{\mathrm{def}}{=}\sqrt{\sum_{i=1}^k\sum_{j=1}^m m_{ij}^2} }[/math]is the Frobenius norm of an [math]\displaystyle{ m \times k }[/math] matrix [math]\displaystyle{ M=(m_{ij}) }[/math].
The LLSF problem always has at least one solution (...)
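Definition 1 can be solved directly with an off-the-shelf least-squares routine. The sketch below uses NumPy's `np.linalg.lstsq` (the paper itself does not prescribe this solver, and the matrices are the same invented toy data as above): minimizing [math]\parallel WA-B\parallel_F[/math] over W is equivalent to solving the transposed system [math]A^TW^T=B^T[/math] column by column.

```python
import numpy as np

# Toy training matrices (hypothetical data, for illustration only).
# A: n x k word-by-text counts; B: m x k category indicators.
A = np.array([[1., 0., 1.],   # "fever"
              [2., 0., 0.],   # "cough"
              [0., 1., 0.],   # "fracture"
              [0., 1., 1.]])  # "pain"
B = np.array([[1., 0., 1.],   # "infection"
              [0., 1., 0.]])  # "injury"

# Minimizing ||W A - B||_F is the same as minimizing ||A^T W^T - B^T||_F,
# which lstsq solves one column of W^T at a time (minimum-norm solution).
W = np.linalg.lstsq(A.T, B.T, rcond=None)[0].T   # W is m x n

# Map a new text (a word-count vector) into category space; the
# highest-scoring row of W @ a_new is the predicted category.
a_new = np.array([1., 1., 0., 0.])   # a text mentioning "fever" and "cough"
scores = W @ a_new
predicted = int(np.argmax(scores))   # 0 -> "infection"
```

Because A here has full column rank, the minimum of the objective is zero and W reproduces the training matches exactly (W A = B); for larger, noisier training sets the fit is approximate, which is the intended behavior of the LLSF mapping.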