Online Learning System
An Online Learning System is a learning system that implements an online learning algorithm to solve an online learning task, i.e., it updates its predictor incrementally as instances arrive in sequence rather than training once on a fixed batch.
- AKA: Online Machine Learning System, Sequential Learning System.
- Example(s):
- Counter-Example(s):
- See: Online Education System, Machine Learning System, Semi-Supervised System, Reinforcement Learning System, Supervised Learning System, Active Learning System, Continual Learning System.
References
2017
- (Auer, 2017) ⇒ Peter Auer. (2017). “Online Learning.” In: Claude Sammut, and Geoffrey I. Webb (eds.), Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA.
- QUOTE: In the online learning model, the learner needs to make predictions or choices about a sequence of instances, one after the other, and receives a loss or reward after each prediction or choice. Typically, the learner receives a description of the current instance before making a prediction. The goal of the learner is to minimize its accumulated losses (or equivalently maximize the accumulated rewards).
The performance of the online learner is usually compared to the best predictor in hindsight from a given class of predictors. This comparison with a predictor in hindsight allows for meaningful performance bounds even without any assumptions on how the sequence of instances is generated. In particular, this sequence of instances may not be generated by a random process but by an adversary that tries to prevent learning.
In this sense performance bounds for online learning are typically worst-case bounds that hold for any sequence of instances. This is possible since the performance bounds are relative to the best predictor from a given class. Often these performance guarantees are quite strong, showing that the learner can do nearly as well as the best predictor from a large class of predictors.
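The following is a minimal Python sketch of the protocol described in this quote: a learner repeatedly commits to a prediction, observes a loss, updates, and is evaluated by its regret against the best fixed predictor in hindsight. It uses the Hedge (multiplicative weights) algorithm over a small set of expert predictors; the random loss sequence, parameter names, and printed bound are illustrative assumptions, not taken from the cited reference.
```python
import numpy as np

# Illustrative sketch (assumed setup, not from Auer 2017):
# Hedge / multiplicative weights over a fixed set of experts,
# with losses in [0, 1] revealed after each prediction.

rng = np.random.default_rng(0)

n_experts = 5
n_rounds = 200
eta = np.sqrt(2 * np.log(n_experts) / n_rounds)  # standard learning-rate choice

weights = np.ones(n_experts)
learner_loss = 0.0
expert_losses = np.zeros(n_experts)

for t in range(n_rounds):
    # The environment (possibly adversarial) fixes a loss in [0, 1] for each
    # expert this round; here it is simply random for illustration.
    losses = rng.random(n_experts)

    # The learner predicts by following an expert drawn from its current weights,
    # then observes the loss of that choice.
    p = weights / weights.sum()
    choice = rng.choice(n_experts, p=p)
    learner_loss += losses[choice]

    # Accumulate each expert's loss and update the weights multiplicatively.
    expert_losses += losses
    weights *= np.exp(-eta * losses)

# Regret: learner's cumulative loss minus the best expert in hindsight.
best_in_hindsight = expert_losses.min()
regret = learner_loss - best_in_hindsight
print(f"learner loss = {learner_loss:.1f}, "
      f"best expert = {best_in_hindsight:.1f}, "
      f"regret = {regret:.1f} "
      f"(Hedge bound ~ sqrt((T/2) ln N) = {np.sqrt(n_rounds / 2 * np.log(n_experts)):.1f})")
```
As the quote notes, the guarantee here is relative: the regret bound holds for any loss sequence, adversarial or not, because performance is measured against the best predictor from the given class rather than in absolute terms.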
2006
- (Sofman et al., 2006) ⇒ Boris Sofman, Ellie Lin, J. Andrew Bagnell, John Cole, Nicolas Vandapel, and Anthony Stentz. (2006). “Improving Robot Navigation Through Self-supervised Online Learning.” doi:10.1002/rob.20169