2011 LearningtoTradeOffBetweenExplor
- (Valizadegan et al., 2011) ⇒ Hamed Valizadegan, Rong Jin, and Shijun Wang. (2011). “Learning to Trade Off Between Exploration and Exploitation in Multiclass Bandit Prediction.” In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2011). ISBN:978-1-4503-0813-7 doi:10.1145/2020408.2020445
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222011%22+Learning+to+Trade+Off+Between+Exploration+and+Exploitation+in+Multiclass+Bandit+Prediction
- http://dl.acm.org/citation.cfm?id=2020408.2020445&preflayout=flat#citedby
Quotes
Author Keywords
- Algorithms; bandit feedback; exploration vs. exploitation; multi-class classification; online learning; parameter learning; theory
Abstract
We study multi-class bandit prediction, an online learning problem where the learner only receives a partial feedback in each trial indicating whether the predicted class label is correct. The exploration vs. exploitation tradeoff strategy is a well-known technique for online learning with incomplete feedback (i.e., bandit setup). Banditron [8], a multi-class online learning algorithm for bandit setting, maximizes the run-time gain by balancing between exploration and exploitation with a fixed tradeoff parameter. The performance of Banditron can be quite sensitive to the choice of the tradeoff parameter and therefore effective algorithms to automatically tune this parameter is desirable. In this paper, we propose three learning strategies to automatically adjust the tradeoff parameter for Banditron. Our extensive empirical study with multiple real-world data sets verifies the efficacy of the proposed approach in learning the exploration vs. exploitation tradeoff parameter.
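To make the fixed tradeoff parameter concrete, here is a minimal Python sketch of the baseline Banditron update as it is commonly described in the literature (the algorithm cited as [8] in the abstract). It is not one of the three tuning strategies proposed in this paper, which the abstract does not detail. The function name `banditron`, its arguments, and the default `gamma=0.05` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def banditron(X, y, num_classes, gamma=0.05, seed=0):
    """Sketch of Banditron with a fixed exploration rate `gamma`.

    X: (n, d) feature array; y: (n,) integer labels in [0, num_classes).
    The true label is used only to compute the one-bit bandit feedback
    (was the played label correct?), never revealed to the update directly.
    Returns the learned weight matrix and the online error rate.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = np.zeros((num_classes, d))
    mistakes = 0
    for t in range(n):
        x = X[t]
        # Exploitation: the label the current linear model prefers.
        y_hat = int(np.argmax(W @ x))
        # Exploration: mix the greedy choice with a uniform distribution.
        probs = np.full(num_classes, gamma / num_classes)
        probs[y_hat] += 1.0 - gamma
        y_tilde = int(rng.choice(num_classes, p=probs))
        # Bandit feedback: only whether the played label was correct.
        correct = bool(y_tilde == int(y[t]))
        mistakes += int(not correct)
        # Importance-weighted estimate of the multiclass Perceptron update.
        U = np.zeros_like(W)
        if correct:
            U[y_tilde] += x / probs[y_tilde]
        U[y_hat] -= x
        W += U
    return W, mistakes / n
```

A larger `gamma` plays random labels more often, which gathers more feedback but costs accuracy in the short run; a smaller `gamma` does the opposite. This is the sensitivity that motivates tuning the parameter automatically.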
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2011 LearningtoTradeOffBetweenExplor | Rong Jin, Hamed Valizadegan, Shijun Wang | | | Learning to Trade Off Between Exploration and Exploitation in Multiclass Bandit Prediction | | | | 10.1145/2020408.2020445 | | 2011 |