Decision Tree Ensemble Learning Classifier
A Decision Tree Ensemble Learning Classifier is a Decision Tree Ensemble Learning System for solving a classification problem.
- AKA: Decision Tree Ensemble Learning Classification System.
- Context:
- It can solve both a Decision Tree Ensemble Learning Task and a Classification Tree Learning Task.
- …
- Example(s):
  - a Random Forest Classifier,
  - an AdaBoost Classifier,
  - a Gradient Tree Boosting Classifier,
  - …
- Counter-Example(s):
  - a Decision Tree Ensemble Learning Regressor.
- See: Decision Tree, Ensemble-based Learning Task, Classification Task, Regression Task.
References
2017a
- (Scikit Learn, 2017) ⇒ "Ensemble Methods". In: http://scikit-learn.org/stable/modules/ensemble.html Retrieved: 2017-10-15.
- QUOTE: The goal of ensemble methods is to combine the predictions of several base estimators built with a given learning algorithm in order to improve generalizability/robustness over a single estimator.
Two families of ensemble methods are usually distinguished:
- In averaging methods, the driving principle is to build several estimators independently and then to average their predictions. On average, the combined estimator is usually better than any single base estimator because its variance is reduced. Examples: Bagging methods, Forests of randomized trees, …
- By contrast, in boosting methods, base estimators are built sequentially and one tries to reduce the bias of the combined estimator. The motivation is to combine several weak models to produce a powerful ensemble. Examples: AdaBoost, Gradient Tree Boosting, …
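To make the two families concrete, here is a minimal sketch (an editor's illustration, not part of the quoted documentation) that fits one decision-tree ensemble from each family with scikit-learn; the synthetic dataset and hyperparameter values are illustrative assumptions. Both estimators use decision trees as base learners by default.

```python
# Minimal sketch: one averaging-family and one boosting-family ensemble,
# both built on decision trees (scikit-learn's defaults). Toy data only.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Averaging: many independently randomized trees; predictions are combined
# by voting, which mainly reduces variance.
averaging = RandomForestClassifier(n_estimators=100, random_state=0)

# Boosting: trees are built sequentially, each one correcting its
# predecessors, which mainly reduces bias.
boosting = AdaBoostClassifier(n_estimators=100, random_state=0)

for name, model in [("random forest (averaging)", averaging),
                    ("AdaBoost (boosting)", boosting)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```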
2017b
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Decision_tree_learning#Decision_tree_types Retrieved: 2017-10-15.
- Decision trees used in data mining are of two main types:
- Classification tree analysis is when the predicted outcome is the class to which the data belongs.
- Regression tree analysis is when the predicted outcome can be considered a real number (e.g. the price of a house, or a patient's length of stay in a hospital).
- The term Classification And Regression Tree (CART) analysis is an umbrella term used to refer to both of the above procedures, first introduced by Breiman et al.[1] Trees used for regression and trees used for classification have some similarities - but also some differences, such as the procedure used to determine where to split.
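As a concrete illustration of the two tree types, the sketch below (with assumed synthetic data and arbitrary depth limits) fits scikit-learn's DecisionTreeClassifier and DecisionTreeRegressor side by side.

```python
# Sketch of the two CART tree types on synthetic data (illustrative only).
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 3))

# Classification tree: the predicted outcome is a discrete class label.
y_class = (X[:, 0] + X[:, 1] > 1.0).astype(int)
clf = DecisionTreeClassifier(max_depth=3).fit(X, y_class)
print(clf.predict(X[:2]))   # class labels (0 or 1)

# Regression tree: the predicted outcome is a real number.
y_real = 10.0 * X[:, 0] + rng.normal(scale=0.1, size=200)
reg = DecisionTreeRegressor(max_depth=3).fit(X, y_real)
print(reg.predict(X[:2]))   # real-valued predictions
```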
Some techniques, often called ensemble methods, construct more than one decision tree (see the sketch after this list):
- Boosted trees: incrementally building an ensemble by training each new instance to emphasize the training instances previously mis-modeled. A typical example is AdaBoost. These can be used for regression-type and classification-type problems.[2][3]
- Bootstrap aggregated (or bagged) decision trees: an early ensemble method that builds multiple decision trees by repeatedly resampling training data with replacement and voting the trees for a consensus prediction.[4]
- A random forest classifier is a specific type of bootstrap aggregating.
- Rotation forest: every decision tree is trained by first applying principal component analysis (PCA) to a random subset of the input features.[5]
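The sketch referenced from the list above fits three of the listed constructions with scikit-learn on an assumed toy dataset; rotation forest is omitted because scikit-learn ships no built-in estimator for it.

```python
# Sketch of the listed ensemble constructions (toy data, illustrative
# hyperparameters). BaggingClassifier defaults to decision-tree base learners.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

models = {
    # Boosted trees: sequential fitting that emphasizes earlier mistakes.
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=1),
    # Bagged trees: bootstrap-resampled training sets plus majority voting.
    "Bagging": BaggingClassifier(n_estimators=50, random_state=1),
    # Random forest: bagging plus randomized feature choice at each split.
    "Random forest": RandomForestClassifier(n_estimators=50, random_state=1),
}
for name, model in models.items():
    print(name, model.fit(X_tr, y_tr).score(X_te, y_te))
```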
A special case of a decision tree is a decision list, which is a one-sided decision tree in which every internal node has exactly one leaf node and exactly one internal node as a child (except for the bottommost node, whose only child is a single leaf node). While less expressive, decision lists are arguably easier to understand than general decision trees due to their added sparsity; they also permit non-greedy learning methods and allow monotonic constraints to be imposed.
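As a toy illustration of this structure (the rules and labels below are hypothetical), a decision list can be represented as an ordered sequence of condition/label pairs in which the first matching rule fires and a single default label plays the role of the bottommost leaf:

```python
# Toy decision list: each internal node tests one condition and has exactly
# one leaf child (its label); the final default is the bottommost leaf.
decision_list = [
    (lambda x: x["age"] < 18, "minor"),           # hypothetical rule 1
    (lambda x: x["income"] > 50_000, "approve"),  # hypothetical rule 2
    (lambda x: x["debt"] > 10_000, "reject"),     # hypothetical rule 3
]
DEFAULT_LABEL = "review"  # leaf child of the bottommost node

def predict(x):
    for condition, label in decision_list:
        if condition(x):        # first matching rule fires
            return label
    return DEFAULT_LABEL

print(predict({"age": 30, "income": 60_000, "debt": 0}))  # -> approve
```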
- ↑ Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984). Classification and regression trees. Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software. ISBN 978-0-412-04841-8.
- ↑ Friedman, J. H. (1999). Stochastic gradient boosting. Stanford University.
- ↑ Hastie, T., Tibshirani, R., Friedman, J. H. (2001). The elements of statistical learning : Data mining, inference, and prediction. New York: Springer Verlag.
- ↑ Breiman, L. (1996). "Bagging Predictors." Machine Learning, 24: 123-140.
- ↑ Rodriguez, J.J. and Kuncheva, L.I. and Alonso, C.J. (2006), Rotation forest: A new classifier ensemble method, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10):1619-1630.