Weighted Matrix Factorization Task
A Weighted Matrix Factorization Task is a matrix factorization task in which each cell of the input matrix is assigned a weight that controls how much its reconstruction error contributes to the factorization objective.
References
2014
- (Levy & Goldberg, 2014) ⇒ Omer Levy, and Yoav Goldberg. (2014). “Neural Word Embedding As Implicit Matrix Factorization.” In: Advances in Neural Information Processing Systems (NIPS 2014).
- QUOTE: ... In this work, we aim to broaden the theoretical understanding of neural-inspired word embeddings. Specifically, we cast SGNS’s training method as weighted matrix factorization, and show that its objective is implicitly factorizing a shifted PMI matrix ...
We conjecture that this behavior is related to the fact that SGNS performs weighted matrix factorization, giving more influence to frequent pairs, as opposed to SVD, which gives the same weight to all matrix cells. While the weighted and non-weighted objectives share the same optimal solution (perfect reconstruction of the shifted PMI matrix), they result in different generalizations when combined with dimensionality constraints.
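For intuition, the shifted PMI matrix that SGNS is said to implicitly factorize can be computed directly from a word-context co-occurrence count matrix; under the weighted view quoted above, cells for frequently co-occurring pairs get more influence, whereas SVD weights all cells equally. The following is only an illustrative sketch (the function name, the dense-matrix assumption, and the default shift k = 5 are choices made here, not taken from the paper):

```python
import numpy as np

def shifted_pmi(counts, k=5):
    """Shifted PMI matrix PMI(w, c) - log k from a dense word-context
    co-occurrence count matrix (rows = words, columns = contexts).
    Cells with zero counts come out as -inf; truncating them at zero
    gives the positive variant (SPPMI) discussed in the paper."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    p_w = counts.sum(axis=1, keepdims=True) / total   # marginal P(w)
    p_c = counts.sum(axis=0, keepdims=True) / total   # marginal P(c)
    p_wc = counts / total                             # joint P(w, c)
    with np.errstate(divide="ignore"):
        pmi = np.log(p_wc) - np.log(p_w * p_c)
    return pmi - np.log(k)
```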
2012
- (Guo & Diab, 2012) ⇒ Weiwei Guo, and Mona Diab. (2012). “Modeling Sentences in the Latent Space.” In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012).
- QUOTE: The weighted matrix factorization (WMF) approach is very similar to SVD, except that it allows for direct control on each matrix cell [math]\displaystyle{ X_{ij} }[/math]. The model factorizes the original matrix X into two matrices such that [math]\displaystyle{ X \approx P^\text{T} Q }[/math], where P is a K × M matrix, and Q is a K × N matrix (figure 1). The model parameters (vectors in P and Q) are optimized by minimizing the objective function: [math]\displaystyle{ \sum_i \sum_j W_{ij}\left( P_{\cdot,i} \cdot Q_{\cdot,j} - X_{ij}\right)^2 + \lambda \lVert P \rVert^2_2 + \lambda \lVert Q \rVert^2_2 \ (3) }[/math] where [math]\displaystyle{ \lambda }[/math] is a free regularization factor, and the weight matrix W defines a weight for each cell in X. Accordingly, [math]\displaystyle{ P_{\cdot,i} }[/math] is a K-dimension latent semantics vector profile for word [math]\displaystyle{ w_i }[/math]; similarly, [math]\displaystyle{ Q_{\cdot,j} }[/math] is the K-dimension vector profile that represents the sentence [math]\displaystyle{ s_j }[/math].
Operations on these K-dimensional vectors have very intuitive semantic meanings:
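To make Equation (3) above concrete, here is a minimal alternating-least-squares sketch: with one factor held fixed, each column of the other factor solves a small weighted ridge-regression problem. The function name, latent dimension, regularization value, and iteration count are illustrative choices, not values from the paper.

```python
import numpy as np

def weighted_mf(X, W, K=100, lam=0.1, n_iters=20, seed=0):
    """Minimize sum_ij W_ij (P[:, i] . Q[:, j] - X_ij)^2
               + lam * ||P||_2^2 + lam * ||Q||_2^2
    by alternating least squares, with X (M x N), P (K x M), Q (K x N),
    so that X is approximated by P^T Q as in Eq. (3)."""
    rng = np.random.default_rng(seed)
    M, N = X.shape
    P = 0.01 * rng.standard_normal((K, M))
    Q = 0.01 * rng.standard_normal((K, N))
    reg = lam * np.eye(K)
    for _ in range(n_iters):
        # With Q fixed, each word vector P[:, i] solves a ridge regression
        # whose terms are weighted by the i-th row of W.
        for i in range(M):
            A = (Q * W[i]) @ Q.T + reg
            b = Q @ (W[i] * X[i])
            P[:, i] = np.linalg.solve(A, b)
        # With P fixed, each sentence vector Q[:, j] is updated symmetrically,
        # weighted by the j-th column of W.
        for j in range(N):
            A = (P * W[:, j]) @ P.T + reg
            b = P @ (W[:, j] * X[:, j])
            Q[:, j] = np.linalg.solve(A, b)
    return P, Q
```

Setting every W_ij = 1 recovers the unweighted, SVD-like objective; giving small weights to missing or unreliable cells is what the quote means by "direct control on each matrix cell".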
2003
- (Srebro & Jaakkola, 2003) ⇒ Nathan Srebro, and Tommi Jaakkola. (2003). “Weighted Low-rank Approximations.” In: Proceedings of the 20th International Conference on Machine Learning (ICML 2003).
- QUOTE: We study the common problem of approximating a target matrix with a matrix of lower rank. We provide a simple and efficient (EM) algorithm for solving weighted low-rank approximation problems, which, unlike their unweighted version, do not admit a closed-form solution in general. We analyze, in addition, the nature of locally optimal solutions that arise in this context, demonstrate the utility of accommodating the weights in reconstructing the underlying low-rank representation, and extend the formulation to non-Gaussian noise models such as logistic models. Finally, we apply the methods developed to a collaborative filtering task.
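The EM algorithm mentioned in the abstract alternates between filling each cell with a weight-dependent blend of the data and the current reconstruction, and then taking an ordinary (unweighted) truncated SVD of the filled matrix. A minimal sketch, assuming the weights have already been scaled into [0, 1] (the function name and iteration count are choices made here):

```python
import numpy as np

def weighted_low_rank_em(X, W, rank, n_iters=50):
    """Weighted low-rank approximation by EM-style iterations:
    the E-step blends observations with the current estimate according to W,
    the M-step is an unweighted rank-`rank` truncated SVD of the blend."""
    X = np.asarray(X, dtype=float)
    X_hat = np.zeros_like(X)
    for _ in range(n_iters):
        # E-step: cells with weight 1 keep the observation, weight-0 cells
        # are imputed from the current low-rank estimate.
        Y = W * X + (1.0 - W) * X_hat
        # M-step: ordinary truncated SVD (the unweighted problem has a
        # closed-form solution, which is what makes this step cheap).
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X_hat = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return X_hat
```

With 0/1 weights this reduces to EM for a low-rank model with missing entries; fractional weights interpolate between keeping the observation and trusting the current reconstruction.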