AdaBoost Classification Algorithm



An AdaBoost Classification Algorithm is an AdaBoost Algorithm that can solve an AdaBoost Classification Task.



References

2017a

2017b

  • (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/AdaBoost Retrieved: 2017-10-22.
    • AdaBoost, short for "Adaptive Boosting", is a machine learning meta-algorithm formulated by Yoav Freund and Robert Schapire who won the Gödel Prize in 2003 for their work. It can be used in conjunction with many other types of learning algorithms to improve their performance. The output of the other learning algorithms ('weak learners') is combined into a weighted sum that represents the final output of the boosted classifier. AdaBoost is adaptive in the sense that subsequent weak learners are tweaked in favor of those instances misclassified by previous classifiers. AdaBoost is sensitive to noisy data and outliers. In some problems it can be less susceptible to the overfitting problem than other learning algorithms. The individual learners can be weak, but as long as the performance of each one is slightly better than random guessing (e.g., their error rate is smaller than 0.5 for binary classification), the final model can be proven to converge to a strong learner.

      Every learning algorithm will tend to suit some problem types better than others, and will typically have many different parameters and configurations to be adjusted before achieving optimal performance on a dataset. AdaBoost (with decision trees as the weak learners) is often referred to as the best out-of-the-box classifier. When used with decision tree learning, information gathered at each stage of the AdaBoost algorithm about the relative 'hardness' of each training sample is fed into the tree growing algorithm such that later trees tend to focus on harder-to-classify examples.
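
The adaptive reweighting and weighted vote described above can be made concrete with a short sketch. The following is a minimal, illustrative implementation of the binary (labels in {-1, +1}) AdaBoost loop, assuming NumPy and scikit-learn are available and using a depth-1 decision tree ("stump") as the weak learner; the function names and the err ≥ 0.5 stopping rule are illustrative choices, not taken from the quoted sources.

  # Minimal sketch of the binary (labels in {-1, +1}) AdaBoost reweighting loop.
  import numpy as np
  from sklearn.tree import DecisionTreeClassifier

  def adaboost_fit(X, y, n_rounds=50):
      n = len(y)
      w = np.full(n, 1.0 / n)                         # initial sample weights w_i = 1/N
      stumps, alphas = [], []
      for _ in range(n_rounds):
          stump = DecisionTreeClassifier(max_depth=1)
          stump.fit(X, y, sample_weight=w)            # fit weak learner on reweighted data
          pred = stump.predict(X)
          err = np.sum(w * (pred != y)) / np.sum(w)   # weighted error rate
          if err >= 0.5:                              # no better than random guessing: stop
              break
          alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
          w *= np.exp(-alpha * y * pred)              # raise weights of misclassified samples
          w /= w.sum()                                # renormalize
          stumps.append(stump)
          alphas.append(alpha)
      return stumps, alphas

  def adaboost_predict(stumps, alphas, X):
      # Weighted (signed) vote over all weak learners.
      scores = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
      return np.sign(scores)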

2017c

  • (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/ensemble.html#AdaBoost Retrieved: 2017-10-22.
    • QUOTE: The module sklearn.ensemble includes the popular boosting algorithm AdaBoost, introduced in 1995 by Freund and Schapire [FS1995] [1].

      The core principle of AdaBoost is to fit a sequence of weak learners (i.e., models that are only slightly better than random guessing, such as small decision trees) on repeatedly modified versions of the data. The predictions from all of them are then combined through a weighted majority vote (or sum) to produce the final prediction. The data modifications at each so-called boosting iteration consist of applying weights w_1, w_2, ..., w_N to each of the training samples. Initially, those weights are all set to w_i = 1/N, so that the first step simply trains a weak learner on the original data. For each successive iteration, the sample weights are individually modified and the learning algorithm is reapplied to the reweighted data. At a given step, those training examples that were incorrectly predicted by the boosted model induced at the previous step have their weights increased, whereas the weights are decreased for those that were predicted correctly. As iterations proceed, examples that are difficult to predict receive ever-increasing influence. Each subsequent weak learner is thereby forced to concentrate on the examples that are missed by the previous ones in the sequence [HTF] [2].

      AdaBoost can be used both for classification and regression problems:
      • For multi-class classification, AdaBoostClassifier implements AdaBoost-SAMME and AdaBoost-SAMME.R [ZZRH2009] [3].
      • For regression, AdaBoostRegressor implements AdaBoost.R2 [D1997] [4].

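Following the quoted description, a minimal usage sketch with scikit-learn's ensemble module; the toy data, parameter values, and cross-validation setup below are illustrative assumptions, not part of the quoted documentation.

  import numpy as np
  from sklearn.datasets import load_iris
  from sklearn.ensemble import AdaBoostClassifier, AdaBoostRegressor
  from sklearn.model_selection import cross_val_score

  # Multi-class classification (AdaBoost-SAMME / SAMME.R); the default weak
  # learner is a depth-1 decision tree.
  X, y = load_iris(return_X_y=True)
  clf = AdaBoostClassifier(n_estimators=100)
  print("classification CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())

  # Regression (AdaBoost.R2) on a noisy sine curve.
  rng = np.random.RandomState(0)
  X_r = rng.uniform(0, 6, size=(200, 1))
  y_r = np.sin(X_r).ravel() + rng.normal(0, 0.1, size=200)
  reg = AdaBoostRegressor(n_estimators=100)
  print("regression CV R^2:", cross_val_score(reg, X_r, y_r, cv=5).mean())
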
2009


  1. Y. Freund and R. Schapire. “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting”, 1997.
  2. T. Hastie, R. Tibshirani, and J. Friedman. “Elements of Statistical Learning, Ed. 2” (https://pdfs.semanticscholar.org/cd43/83e8a520330588292bfece8b2d871195235b.pdf), Springer, 2009.
  3. J. Zhu, H. Zou, S. Rosset, and T. Hastie. “Multi-class AdaBoost” (http://www.web.stanford.edu/~hastie/Papers/SII-2-3-A8-Zhu.pdf), 2009.
  4. H. Drucker. “Improving Regressors using Boosting Techniques”, 1997.