Gradient Tree Boosting System
A Gradient Tree Boosting System is a decision tree ensemble learning system that applies a Gradient Tree Boosting Algorithm to solve a Gradient Tree Boosting Task.
- AKA: Gradient Boosted Trees System.
- Context:
- It can range from being a Gradient Tree Boosting Classification System to being a Gradient Tree Boosting Regression System.
- …
- Example(s):
- Counter-Example(s):
- See: Naive-Bayes Training System.
References
2017a
- (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/ensemble.html#gradient-tree-boosting Retrieved: 2017-10-22.
- QUOTE: Gradient Tree Boosting or Gradient Boosted Regression Trees (GBRT) is a generalization of boosting to arbitrary differentiable loss functions. GBRT is an accurate and effective off-the-shelf procedure that can be used for both regression and classification problems. Gradient Tree Boosting models are used in a variety of areas including Web search ranking and ecology.
The advantages of GBRT are:
- Natural handling of data of mixed type (= heterogeneous features)
- Predictive power
- Robustness to outliers in output space (via robust loss functions)
- The disadvantages of GBRT are:
- Scalability: due to the sequential nature of boosting, it can hardly be parallelized.
- The module sklearn.ensemble provides methods for both classification and regression via gradient boosted regression trees.
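As a concrete illustration (not part of the quoted documentation), the following minimal Python sketch exercises both estimators from sklearn.ensemble; the synthetic datasets and hyperparameter values are assumptions chosen purely for demonstration:
```python
# Minimal sketch of scikit-learn's gradient boosted tree estimators.
# Hyperparameter values here are illustrative, not recommendations.
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Classification: the predicted outcome is a discrete class label.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)
clf.fit(X_train, y_train)
print("classification accuracy:", clf.score(X_test, y_test))

# Regression: the predicted outcome is a real number. The 'huber' loss
# illustrates the robustness to outliers via robust loss functions noted above.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
reg = GradientBoostingRegressor(loss='huber', n_estimators=100)
reg.fit(X_train, y_train)
print("regression R^2:", reg.score(X_test, y_test))
```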
2017b
- (Wikipedia, 2017b) ⇒ https://en.wikipedia.org/wiki/Decision_tree_learning#Decision_tree_types Retrieved: 2017-10-15.
- Decision trees used in data mining are of two main types:
- Classification tree analysis is when the predicted outcome is the class to which the data belongs.
- Regression tree analysis is when the predicted outcome can be considered a real number (e.g. the price of a house, or a patient's length of stay in a hospital).
- The term Classification And Regression Tree (CART) analysis is an umbrella term used to refer to both of the above procedures, first introduced by Breiman et al.[1] Trees used for regression and trees used for classification have some similarities - but also some differences, such as the procedure used to determine where to split.
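As an illustration of the two tree types (not part of the quoted article), here is a minimal sketch using scikit-learn's tree module, which implements an optimized version of CART; the toy datasets are chosen only for demonstration:
```python
# Minimal sketch contrasting classification trees and regression trees
# with scikit-learn's CART-based estimators.
from sklearn.datasets import load_diabetes, load_iris
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification tree: the predicted outcome is a class label.
X_cls, y_cls = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3).fit(X_cls, y_cls)
print(clf.predict(X_cls[:1]))  # e.g. [0], a class index

# Regression tree: the predicted outcome is a real number.
X_reg, y_reg = load_diabetes(return_X_y=True)
reg = DecisionTreeRegressor(max_depth=3).fit(X_reg, y_reg)
print(reg.predict(X_reg[:1]))  # a real-valued prediction
```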
Some techniques, often called ensemble methods, construct more than one decision tree:
- Boosted trees: incrementally building an ensemble by training each new instance to emphasize the training instances previously mis-modeled. A typical example is AdaBoost. These can be used for both regression-type and classification-type problems (a sketch follows this list). [2] [3]
- Bootstrap aggregated (or bagged) decision trees, an early ensemble method, builds multiple decision trees by repeatedly resampling training data with replacement, and voting the trees for a consensus prediction. [4]
- A random forest classifier is a specific type of bootstrap aggregating.
- Rotation forest - in which every decision tree is trained by first applying principal component analysis (PCA) on a random subset of the input features. [5]
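As a hedged sketch of the first three ensemble methods above, using scikit-learn's built-in estimators (rotation forest has no built-in scikit-learn estimator and is omitted; the synthetic dataset and hyperparameters are illustrative assumptions):
```python
# Minimal sketch comparing boosted, bagged, and random-forest ensembles.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

ensembles = {
    "boosted trees (AdaBoost)": AdaBoostClassifier(n_estimators=50),
    "bagged trees": BaggingClassifier(DecisionTreeClassifier(), n_estimators=50),
    "random forest": RandomForestClassifier(n_estimators=50),
}
for name, model in ensembles.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```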
A special case of a decision tree is a decision list, which is a one-sided decision tree, so that every internal node has exactly 1 leaf node and exactly 1 internal node as a child (except for the bottommost node, whose only child is a single leaf node). While less expressive, decision lists are arguably easier to understand than general decision trees due to their added sparsity; they also permit non-greedy learning methods and allow monotonic constraints to be imposed.
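To make the decision-list structure concrete, here is a minimal sketch with hand-written, hypothetical rules (an illustration of the definition above, not a learned model):
```python
# A decision list as a chain of tests: each node tests one condition and
# has exactly one leaf (the returned label) and one child (the next test),
# except the bottommost node, whose only child is a single leaf.
def decision_list(x: dict) -> str:
    if x["age"] < 18:
        return "minor"        # leaf of the first node
    elif x["income"] > 100_000:
        return "high-income"  # leaf of the second node
    elif x["is_student"]:
        return "student"      # leaf of the third node
    else:
        return "default"      # single leaf of the bottommost node

print(decision_list({"age": 30, "income": 50_000, "is_student": False}))
```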
- ↑ Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984). Classification and regression trees. Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software. ISBN 978-0-412-04841-8.
- ↑ Friedman, J. H. (1999). Stochastic gradient boosting. Stanford University.
- ↑ Hastie, T., Tibshirani, R., Friedman, J. H. (2001). The elements of statistical learning : Data mining, inference, and prediction. New York: Springer Verlag.
- ↑ Breiman, L. (1996). "Bagging Predictors." Machine Learning, 24: 123-140.
- ↑ Rodriguez, J.J. and Kuncheva, L.I. and Alonso, C.J. (2006), Rotation forest: A new classifier ensemble method, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10):1619-1630.