AdaBoost Classification System
An AdaBoost Classification System is an AdaBoost System that implements an AdaBoost Classification Algorithm to solve an AdaBoost Classification Task.
- AKA: AdaBoost Classifier.
- Context:
- It is an AdaBoost System for solving a classification problem.
- It can solve Decision Tree Ensemble Learning Tasks and AdaBoost Classification Tasks by implementing an AdaBoost Algorithm such as AdaBoost-SAMME or AdaBoost-SAMME.R.
- It can range from being a Binary Classification System to being a Multiclass Classification System (see the usage sketch below).
- …
- Example(s):
  - sklearn.ensemble.AdaBoostClassifier.
- Counter-Example(s):
  - an AdaBoost Regression System (such as sklearn.ensemble.AdaBoostRegressor).
- See: Decision Tree, Ensemble-based Learning Task, Classification Task, Regression Task.
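The following minimal usage sketch (a hypothetical example assuming scikit-learn, whose AdaBoostClassifier is quoted in the references below) illustrates the context points above: the same system can act as a binary or a multiclass classifier.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Binary classification: two classes.
X_bin, y_bin = make_classification(n_samples=200, n_classes=2, random_state=0)
binary_clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_bin, y_bin)
print("binary training accuracy:", binary_clf.score(X_bin, y_bin))

# Multiclass classification: the same estimator handles three classes
# via the SAMME / SAMME.R multi-class extensions.
X_multi, y_multi = make_classification(
    n_samples=300, n_classes=3, n_informative=4, random_state=0)
multi_clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_multi, y_multi)
print("multiclass training accuracy:", multi_clf.score(X_multi, y_multi))
```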
References
2017a
- (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html Retrieved: 2017-10-29.
- QUOTE: An AdaBoost [1] classifier is a meta-estimator that begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset but where the weights of incorrectly classified instances are adjusted such that subsequent classifiers focus more on difficult cases.
This class implements the algorithm known as AdaBoost-SAMME [2].
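A brief sketch of the meta-estimator behaviour described above: the boosted classifier repeatedly re-fits copies of a weak base classifier (here a depth-1 decision tree) on reweighted data. Parameter names (base_estimator, algorithm) follow the scikit-learn 0.19-era API cited here; newer releases rename base_estimator to estimator.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

clf = AdaBoostClassifier(
    base_estimator=DecisionTreeClassifier(max_depth=1),  # the weak learner to be copied
    n_estimators=100,                                     # number of boosting rounds
    algorithm="SAMME.R",                                  # real-valued multi-class variant
    random_state=0,
)
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy of the boosted ensemble
```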
2017b
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/AdaBoost Retrieved: 2017-10-22.
- QUOTE: AdaBoost, short for "Adaptive Boosting", is a machine learning meta-algorithm formulated by Yoav Freund and Robert Schapire who won the Gödel Prize in 2003 for their work. It can be used in conjunction with many other types of learning algorithms to improve their performance. The output of the other learning algorithms ('weak learners') is combined into a weighted sum that represents the final output of the boosted classifier. AdaBoost is adaptive in the sense that subsequent weak learners are tweaked in favor of those instances misclassified by previous classifiers. AdaBoost is sensitive to noisy data and outliers. In some problems it can be less susceptible to the overfitting problem than other learning algorithms. The individual learners can be weak, but as long as the performance of each one is slightly better than random guessing (e.g., their error rate is smaller than 0.5 for binary classification), the final model can be proven to converge to a strong learner.
Every learning algorithm will tend to suit some problem types better than others, and will typically have many different parameters and configurations to be adjusted before achieving optimal performance on a dataset. AdaBoost (with decision trees as the weak learners) is often referred to as the best out-of-the-box classifier. When used with decision tree learning, information gathered at each stage of the AdaBoost algorithm about the relative 'hardness' of each training sample is fed into the tree growing algorithm such that later trees tend to focus on harder-to-classify examples.
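As a minimal illustration of the weighted-sum combination described above (a sketch only: weak_learners and alphas are hypothetical names for the fitted stage classifiers and their stage weights, with labels assumed to be in {-1, +1}):

```python
import numpy as np

def boosted_predict(X, weak_learners, alphas):
    """Combine weak learners by a weighted sum: sign(sum_t alpha_t * h_t(x))."""
    scores = np.zeros(X.shape[0])
    for alpha, h in zip(alphas, weak_learners):
        scores += alpha * h.predict(X)  # each weak learner predicts -1 or +1
    return np.sign(scores)              # the boosted classifier's final output
```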
2017c
- (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/ensemble.html#AdaBoost Retrieved: 2017-10-22.
- QUOTE: The module sklearn.ensemble includes the popular boosting algorithm AdaBoost, introduced in 1995 by Freund and Schapire [FS1995] [1].
The core principle of AdaBoost is to fit a sequence of weak learners (i.e., models that are only slightly better than random guessing, such as small decision trees) on repeatedly modified versions of the data. The predictions from all of them are then combined through a weighted majority vote (or sum) to produce the final prediction. The data modifications at each so-called boosting iteration consist of applying weights [math]\displaystyle{ w_1, w_2,\cdots, w_N }[/math] to each of the training samples. Initially, those weights are all set to [math]\displaystyle{ w_i = 1/N }[/math], so that the first step simply trains a weak learner on the original data. For each successive iteration, the sample weights are individually modified and the learning algorithm is reapplied to the reweighted data. At a given step, those training examples that were incorrectly predicted by the boosted model induced at the previous step have their weights increased, whereas the weights are decreased for those that were predicted correctly. As iterations proceed, examples that are difficult to predict receive ever-increasing influence. Each subsequent weak learner is thereby forced to concentrate on the examples that are missed by the previous ones in the sequence [HTF][2].
AdaBoost can be used both for classification and regression problems:
- For multi-class classification, AdaBoostClassifier implements AdaBoost-SAMME and AdaBoost-SAMME.R [ZZRH2009][3].
- For regression, AdaBoostRegressor implements AdaBoost.R2 [D1997][4].
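The reweighting loop described in the passage above can be sketched from scratch as follows (binary, discrete AdaBoost with labels in {-1, +1}; a simplified illustration under those assumptions, not scikit-learn's SAMME.R implementation):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=50):
    n = X.shape[0]
    w = np.full(n, 1.0 / n)             # w_i = 1/N: the first round uses the raw data
    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        miss = stump.predict(X) != y
        err = np.dot(w, miss)            # weighted error of this round's weak learner
        if err == 0 or err >= 0.5:       # stop if perfect or no better than chance
            break
        alpha = 0.5 * np.log((1 - err) / err)
        # increase weights of misclassified samples, decrease the correctly classified ones
        w *= np.exp(alpha * np.where(miss, 1.0, -1.0))
        w /= w.sum()
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas
```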
2009
- (Hastie et al., 2009) ⇒ Hastie, T., Rosset, S., Zhu, J., & Zou, H. (2009). Multi-class adaboost. Statistics and its Interface, 2(3), 349-360.
- ABSTRACT: Boosting has been a very successful technique for solving the two-class classification problem. In going from two-class to multi-class classification, most algorithms have been restricted to reducing the multi-class classification problem to multiple two-class problems. In this paper, we develop a new algorithm that directly extends the AdaBoost algorithm to the multi-class case without reducing it to multiple two-class problems. We show that the proposed multi-class AdaBoost algorithm is equivalent to a forward stagewise additive modeling algorithm that minimizes a novel exponential loss for multi-class classification. Furthermore, we show that the exponential loss is a member of a class of Fisher-consistent loss functions for multi-class classification. As shown in the paper, the new algorithm is extremely easy to implement and is highly competitive in terms of misclassification error rate.
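As a brief aside grounded in the paper's algorithm, the stage weight used by the multi-class SAMME extension differs from the binary AdaBoost weight only by an additive [math]\displaystyle{ \log(K-1) }[/math] term: with [math]\displaystyle{ err_m }[/math] the weighted error of the [math]\displaystyle{ m }[/math]-th weak learner over [math]\displaystyle{ K }[/math] classes, [math]\displaystyle{ \alpha_m = \log\frac{1-err_m}{err_m} + \log(K-1) }[/math]. The weak learner therefore only needs to be more accurate than random guessing among [math]\displaystyle{ K }[/math] classes (error below [math]\displaystyle{ (K-1)/K }[/math]) rather than below [math]\displaystyle{ 1/2 }[/math].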
- ↑ Y. Freund and R. Schapire, “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting”, 1997.
- ↑ T. Hastie, R. Tibshirani and J. Friedman, “Elements of Statistical Learning Ed. 2”, Springer, 2009.
- ↑ J. Zhu, H. Zou, S. Rosset, T. Hastie. “Multi-class AdaBoost”, 2009.
- ↑ H. Drucker. “Improving Regressors using Boosting Techniques”, 1997.