Regression Tree Learning System

Context:
- It can be a component of a Decision Tree Ensemble Learning Regressor.
Example(s):
- rpart System.
- sklearn.tree.DecisionTreeRegressor.
- A Least Squares Regression Tree System.
- A First-Order Regression Tree System.
- …
Counter-Example(s):
See: Regression System, Classification System, Trained Regression Tree Model, Kernel-based Classification Algorithm.

References

(Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Decision_tree_learning Retrieved:2017-10-15.
- Decision tree learning uses a decision tree (as a predictive model) to go from observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves). It is one of the predictive modelling approaches used in statistics, data mining and machine learning. Tree models where the target variable can take a discrete set of values are called classification trees ; in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. Decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.
  In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making. In data mining, a decision tree describes data (but the resulting classification tree can be an input for decision making). This page deals with decision trees in data mining.

(Torgo, 2017) ⇒ Luı́s Torgo, (2017). "Regression Trees". In Encyclopedia of Machine Learning and Data Mining pp 1080-1083.
- QUOTE: Regression trees are supervised learning methods that address multiple regression problems. They provide a tree-based approximation f^, of an unknown regression function [math]\displaystyle{ Y \in \mathcal{R} }[/math] and [math]\displaystyle{ \epsilon \approx N (0, \sigma^2) }[/math], based on a given sample of data [math]\displaystyle{ D=\{\langle x_i^1,\cdots, x_i^p,y_i\}^n_{i=1} }[/math]. The obtained models consist of a hierarchy of logical tests on the values of any of the [math]\displaystyle{ p }[/math] predictor variables. The terminal nodes of these trees, known as the leaves, contain the numerical predictions of the model for the target variable [math]\displaystyle{ Y }[/math]

(Furnkranz, 2017) ⇒ Johannes Fürnkranz, (2017). "Decision Tree". In Encyclopedia of Machine Learning and Data Mining pp 330-335.
- ABSTRACT: The induction of decision trees is one of the oldest and most popular techniques for learning discriminatory models, which has been developed independently in the statistical (Breiman et al. 1984^[1]; Kass 1980 ^[2]) and machine learning (Hunt et al. 1966 ^[3]; Quinlan 1983^[4], 1986^[5]) communities. A decision tree is a tree-structured classification model, which is easy to understand, even by non-expert users, and can be efficiently induced from data. An extensive survey of decision-tree learning can be found in Murthy (1998).
- QUOTE: Decision trees are also often used as components in Ensemble Methods such as random forests (Breiman 2001 ^[6]) or AdaBoost (Freund and Schapire 1996^[7]). They can also be modified for predicting numerical target variables, in which case they are known as regression trees. One can also put more complex prediction models into the leaves of a tree, resulting in Model Trees.

(scikit-learn.org, 2015) ⇒ 1.10.2. Regression. In User Guide, Documentation of scikit-learn 0.19.1.
- Decision trees can also be applied to regression problems, using the DecisionTreeRegressor class.
  As in the classification setting, the fit method will take as argument arrays X and y, only that in this case y is expected to have floating point values instead of integer values:

(Loh, 2011) ⇒ Loh, W. Y. (2011). Classification and regression trees. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(1), 14-23.
- ABSTRACT: Classification and regression trees are machine-learning methods for constructing prediction models from data. The models are obtained by recursively partitioning the data space and fitting a simple prediction model within each partition. As a result, the partitioning can be represented graphically as a decision tree. Classification trees are designed for dependent variables that take a finite number of unordered values, with prediction error measured in terms of misclassification cost. Regression trees are for dependent variables that take continuous or ordered discrete values, with prediction error typically measured by the squared difference between the observed and predicted values. This article gives an introduction to the subject by reviewing some widely available algorithms and comparing their capabilities, strengths, and weakness in two examples.

↑ Breiman L, Friedman JH, Olshen R, Stone C (1984) Classification and regression trees. Wadsworth & Brooks, Pacific Grove
↑ Kass GV (1980) An exploratory technique for investigating large quantities of categorical data. Appl Stat 29:119–127
↑ Hunt EB, Marin J, Stone PJ (1966) Experiments in induction. Academic, New York
↑ Quinlan JR (1983) Learning efficient classification procedures and their application to chess end games. In: Michalski RS, Carbonell JG, Mitchell TM (eds) Machine learning. An artificial intelligence approach, Tioga, Palo Alto, pp 463–482
↑ Quinlan JR (1986) Induction of decision trees. Mach Learn 1:81–106
↑ Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.
↑ Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: Saitta L (ed) Proceedings of the 13th International Conference on Machine Learning, Bari. Morgan Kaufmann, pp 148–156