2013 SpeedingUpLargeScaleLearningwit

Subject Headings:

Notes

Slow convergence and poor initial accuracy are two problems that plague efforts to use very large feature sets in online learning. This is especially true when only a few features are "active" in any training example, and the frequency of activations of different features is skewed. We show how these problems can be mitigated if a graph of relationships between features is known. We study this problem in a fully Bayesian setting, focusing on the problem of using Facebook user-IDs as features, with the social network giving the relationship structure. Our analysis uncovers significant problems with the obvious regularizations, and motivates a two-component mixture-model "social prior" that is provably better. Empirical results on large-scale click prediction problems show that our algorithm can learn as well as the baseline with 12 M fewer training examples, and continuously outperforms it for over 60 M examples. On a second problem using binned features, our model outperforms the baseline even after the latter sees 5x as much data.

	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2013 SpeedingUpLargeScaleLearningwit	Ralf Herbrich Deepayan Chakrabarti			Speeding Up Large-scale Learning with a Social Prior				10.1145/2487575.2487587		2013