2011 FastCoordinateDescentMethodsWithVariableSelectionForNonNegativeMatrixFactorization
- (Hsieh & Dhillon, 2011) ⇒ Cho-Jui Hsieh, and Inderjit S. Dhillon. (2011). “Fast Coordinate Descent Methods with Variable Selection for Non-negative Matrix Factorization.” In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2011). ISBN:978-1-4503-0813-7 doi:10.1145/2020408.2020577
Subject Headings: Non-negative Matrix Factorization (NMF), Coordinate Descent Algorithm.
Notes
Cited By
- http://scholar.google.com/scholar?q=%222011%22+Fast+Coordinate+Descent+Methods+with+Variable+Selection+for+Non-negative+Matrix+Factorization
- http://dl.acm.org/citation.cfm?id=2020408.2020577&preflayout=flat#citedby
Quotes
Author Keywords
- Algorithms; constrained optimization; convergence; coordinate descent method; experimentation; learning; non-negative matrix factorization; performance
Abstract
Non-negative Matrix Factorization (NMF) is an effective dimension reduction method for non-negative dyadic data, and has proven to be useful in many areas, such as text mining, bioinformatics and image processing. NMF is usually formulated as a constrained non-convex optimization problem, and many algorithms have been developed for solving it. Recently, a coordinate descent method, called FastHals, has been proposed to solve least squares NMF and is regarded as one of the state-of-the-art techniques for the problem. In this paper, we first show that FastHals has an inefficiency in that it uses a cyclic coordinate descent scheme and thus performs unneeded descent steps on unimportant variables. We then present a variable selection scheme that uses the gradient of the objective function to arrive at a new coordinate descent method. Our new method is considerably faster in practice and we show that it has theoretical convergence guarantees. Moreover, when the solution is sparse, as is often the case in real applications, our new method benefits by selecting important variables to update more often, thus resulting in higher speed. As an example, on the text dataset RCV1, our method is 7 times faster than FastHals, and more than 15 times faster when the sparsity is increased by adding an L1 penalty. We also develop new coordinate descent methods for the case where the error in NMF is measured by KL-divergence, by applying the Newton method to solve the one-variable sub-problems. Experiments indicate that our algorithm for minimizing the KL-divergence is faster than the Lee & Seung multiplicative update rule by a factor of 10 on the CBCL image dataset.
References
 | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year
---|---|---|---|---|---|---|---|---|---|---
2011 FastCoordinateDescentMethodsWithVariableSelectionForNonNegativeMatrixFactorization | Cho-Jui Hsieh; Inderjit S. Dhillon | | 2011 | Fast Coordinate Descent Methods with Variable Selection for Non-negative Matrix Factorization | | | | 10.1145/2020408.2020577 | | 2011