AdaBoost Algorithm
An AdaBoost Algorithm is an iterative boosting algorithm (most commonly applied to decision tree learners) in which each iteration assigns to each training record a weight based on the ensemble's current error [math]\displaystyle{ E(F_{t-1}(x_i)) }[/math] on that record.
- AKA: AdaBoost.
- Context:
- It can be applied by an AdaBoost System (to solve an AdaBoost task), as in the usage sketch following this list.
- Example(s):
- AdaBoost.M1 (Discrete AdaBoost), for binary classification.
- Counter-Example(s):
- a Bagged Trees Algorithm, such as Random Forests.
- See: Wagging Algorithm, Ensemble Algorithm.
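A minimal usage sketch of one such AdaBoost System, here scikit-learn's AdaBoostClassifier; the library choice, synthetic dataset, and parameter values are illustrative assumptions, not part of this entry:

```python
# Usage sketch of an AdaBoost System (scikit-learn's AdaBoostClassifier).
# Dataset and parameter values are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# A small synthetic binary classification task.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Boost 50 weak learners (shallow decision trees by default).
clf = AdaBoostClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
```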
References
2014
- http://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Classification/adaboost#Algorithm
- While boosting has evolved somewhat over the years, we describe the most commonly used version of the AdaBoost procedure (Freund & Schapire, 1996), which we call Discrete AdaBoost. This is essentially the same as AdaBoost.M1 for binary data in Freund and Schapire. Here is a concise description of AdaBoost in the two-class classification setting. We have training data [math]\displaystyle{ (x_1,y_1), \ldots , (x_n,y_n) }[/math] with [math]\displaystyle{ x_i }[/math] a vector-valued feature and [math]\displaystyle{ y_i = -1 }[/math] or 1. We define [math]\displaystyle{ F(x) = \sum_{m=1}^{M} c_m f_m(x) }[/math] where each [math]\displaystyle{ f_m(x) }[/math] is a classifier producing values plus or minus 1 and the [math]\displaystyle{ c_m }[/math] are constants; the corresponding prediction is [math]\displaystyle{ \operatorname{sign}(F(x)) }[/math]. AdaBoost trains the classifiers [math]\displaystyle{ f_m(x) }[/math] on weighted versions of the training sample, giving higher weight to cases that are currently misclassified. This is done for a sequence of weighted samples, and then the final classifier is defined to be a linear combination of the classifiers from each stage.
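The description above translates directly into code. The following is a minimal sketch of Discrete AdaBoost with one-level decision stumps as the weak learners [math]\displaystyle{ f_m(x) }[/math]; the function names (fit_stump, discrete_adaboost, predict) and the choice of stump weak learner are illustrative assumptions, not part of the quoted passage:

```python
# Minimal sketch of Discrete AdaBoost (two-class, labels in {-1, +1}) with
# decision stumps as the weak learners f_m(x). Function names are illustrative.
# X is a 2-D numpy array of features; y is a numpy array of -1/+1 labels.
import numpy as np

def fit_stump(X, y, w):
    """Return (feature, threshold, polarity, weighted error) of the best stump."""
    best = (0, 0.0, 1, np.inf)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[3]:
                    best = (j, thr, pol, err)
    return best

def discrete_adaboost(X, y, M=20):
    """Fit M weighted stumps; return the (c_m, stump) pairs defining F(x)."""
    n = len(y)
    w = np.full(n, 1.0 / n)                   # start with uniform record weights
    ensemble = []
    for _ in range(M):
        j, thr, pol, err = fit_stump(X, y, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)  # guard against degenerate errors
        c = 0.5 * np.log((1.0 - err) / err)   # classifier weight c_m
        pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
        w *= np.exp(-c * y * pred)            # up-weight misclassified records
        w /= w.sum()                          # renormalize to a distribution
        ensemble.append((c, (j, thr, pol)))
    return ensemble

def predict(ensemble, X):
    """Prediction is sign(F(x)) = sign(sum_m c_m f_m(x))."""
    F = np.zeros(len(X))
    for c, (j, thr, pol) in ensemble:
        F += c * np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
    return np.sign(F)
```

Calling discrete_adaboost(X, y) with y coded as -1/+1 returns the coefficients [math]\displaystyle{ c_m }[/math] and stumps whose weighted vote gives [math]\displaystyle{ \operatorname{sign}(F(x)) }[/math].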
2013
- http://en.wikipedia.org/wiki/AdaBoost
- AdaBoost, short for Adaptive Boosting, is a machine learning algorithm, formulated by Yoav Freund and Robert Schapire.[1] It is a meta-algorithm, and can be used in conjunction with many other learning algorithms to improve their performance. AdaBoost is adaptive in the sense that subsequent classifiers built are tweaked in favor of those instances misclassified by previous classifiers. AdaBoost is sensitive to noisy data and outliers. In some problems, however, it can be less susceptible to the overfitting problem than most learning algorithms. The classifiers it uses can be weak (i.e., display a substantial error rate), but as long as their performance is slightly better than random (i.e. their error rate is smaller than 0.5 for binary classification), they will improve the final model. Even classifiers with an error rate higher than would be expected from a random classifier will be useful, since they will have negative coefficients in the final linear combination of classifiers and hence behave like their inverses.
AdaBoost generates and calls a new weak classifier in each of a series of rounds [math]\displaystyle{ t = 1,\ldots,T }[/math]. For each call, a distribution of weights [math]\displaystyle{ D_{t} }[/math] is updated that indicates the importance of examples in the data set for the classification. On each round, the weights of each incorrectly classified example are increased, and the weights of each correctly classified example are decreased, so the new classifier focuses on the examples which have so far eluded correct classification.
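Concretely, in the standard formulation (cf. Schapire & Singer, 1999; the symbols [math]\displaystyle{ h_t }[/math], [math]\displaystyle{ \epsilon_t }[/math], [math]\displaystyle{ \alpha_t }[/math], [math]\displaystyle{ Z_t }[/math] are introduced here for illustration), if [math]\displaystyle{ h_t }[/math] is the weak classifier chosen in round [math]\displaystyle{ t }[/math] with weighted error [math]\displaystyle{ \epsilon_t }[/math] and [math]\displaystyle{ Z_t }[/math] is a normalizer, the distribution is updated as
[math]\displaystyle{ \alpha_t = \tfrac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}, \qquad D_{t+1}(i) = \frac{D_t(i)\,\exp\!\left(-\alpha_t\, y_i\, h_t(x_i)\right)}{Z_t}, }[/math]
so misclassified examples ([math]\displaystyle{ y_i h_t(x_i) = -1 }[/math]) have their weight multiplied by [math]\displaystyle{ e^{\alpha_t} \gt 1 }[/math] and correctly classified ones by [math]\displaystyle{ e^{-\alpha_t} \lt 1 }[/math]. Note also that [math]\displaystyle{ \alpha_t \gt 0 }[/math] when [math]\displaystyle{ \epsilon_t \lt \tfrac{1}{2} }[/math] and [math]\displaystyle{ \alpha_t \lt 0 }[/math] when [math]\displaystyle{ \epsilon_t \gt \tfrac{1}{2} }[/math], which is why a worse-than-random weak classifier enters the final combination with a negative coefficient.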
2011
- (Sammut & Webb, 2011) ⇒ Claude Sammut (editor), and Geoffrey I. Webb (editor). (2011). “AdaBoost.” In: Encyclopedia of Machine Learning, p.19. Springer.
2009
- (Wu & Kumar, 2009) ⇒ Xindong Wu (editor), and Vipin Kumar (editor). (2009). “The Top Ten Algorithms in Data Mining.” Chapman & Hall. ISBN:1420089641
2002
- (Carreras et al., 2002) ⇒ Xavier Carreras, Lluís Màrquez, and Lluís Padró. (2002). “Named Entity Extraction Using AdaBoost.” In: Proceedings of CoNLL 2002.
2001
- (Rätsch et al., 2001) ⇒ G. Rätsch, T. Onoda, and K. R. Müller. (2001). “Soft Margins for AdaBoost.” In: Machine Learning, 42(3). doi:10.1023/A:1007618119488
1999
- (Schapire & Singer, 1999) ⇒ Robert E. Schapire, and Yoram Singer. (1999). “Improved Boosting Algorithms Using Confidence-rated Predictions.” In: Machine Learning, 37(3). doi:10.1023/A:1007614523901
1997
- (Margineantu & Dietterich, 1997) ⇒ Dragos D. Margineantu, and Thomas G. Dietterich. (1997). “Pruning Adaptive Boosting.” In: Proceedings of the Fourteenth International Conference on Machine Learning (ICML 1997).
1996
- (Freund & Schapire, 1996) ⇒ Yoav Freund, and Robert E. Schapire. (1996). “Experiments With a New Boosting Algorithm.” In: Proceedings of the International Conference on Machine Learning (ICML 1996).