Trained Regression Tree Model
A Trained Regression Tree Model is a decision tree model composed of regression tree nodes.
- Context:
- It can (often) be used a Regressed Point Estimator.
- It can be produced by a Regression Tree Learning System (that implements a Regression Tree Induction Algorithm).
- It can be pruned by a Regression Tree Post-Pruning System (that implements a Regression Tree Post-Pruning Algorithm).
- Example(s):
- one created by a sklearn.tree.DecisionTreeRegressor.
- one created by a CTree System.
- …
- Counter-Example(s):
- See: Piecewise Constant Regression Model, Regression Tree Learning System, Classification Tree Learning System, Regression Task, Classification Task, Decision Tree Ensemble Learning System.
References
2017
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Decision_tree_learning Retrieved:2017-10-15.
- Decision tree learning uses a decision tree (as a predictive model) to go from observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves). It is one of the predictive modelling approaches used in statistics, data mining and machine learning. Tree models where the target variable can take a discrete set of values are called classification trees ; in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. Decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.
In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making. In data mining, a decision tree describes data (but the resulting classification tree can be an input for decision making). This page deals with decision trees in data mining.
- Decision tree learning uses a decision tree (as a predictive model) to go from observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves). It is one of the predictive modelling approaches used in statistics, data mining and machine learning. Tree models where the target variable can take a discrete set of values are called classification trees ; in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. Decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.
1999
- (Torgo, 1999) ⇒ Louis Torgo. (1999). “Inductive Learning of Tree-based Regression Models." Ph.D. Thesis, Thesis, Faculty of Sciences, University of Porto
- ABSTRACT: This thesis explores different aspects of the induction of tree-based regression models from data. The main goal of this study is to improve the predictive accuracy of regression trees, while retaining as much as possible their comprehensibility and computational efficiency. Our study is divided in three main parts.
In the first part we describe in detail two different methods of growing a regression tree: minimising the mean squared error and minimising the mean absolute deviation. Our study is particularly focussed on the computational efficiency of these tasks. We present several new algorithms that lead to significant computational speed-ups. We also describe an experimental comparison of both methods of growing a regression tree highlighting their different application goals.
Pruning is a standard procedure within tree-based models whose goal is to provide a good compromise for achieving simple and comprehensible models with good predictive accuracy. In the second part of our study we describe a series of new techniques for pruning by selection from a series of alternative pruned trees. We carry out an extensive set of experiments comparing different methods of pruning, which show that our proposed techniques are able to significantly outperform the predictive accuracy of current state of the art pruning algorithms in a large set of regression domains.
In the final part of our study we present a new type of tree-based models that we refer to as local regression trees. These hybrid models integrate tree-based regression with local modelling techniques. We describe different types of local regression trees and show that these models are able to significantly outperform standard regression trees in terms of predictive accuracy. Through a large set of experiments we prove the competitiveness of local regression trees when compared to existing regression techniques.
- ABSTRACT: This thesis explores different aspects of the induction of tree-based regression models from data. The main goal of this study is to improve the predictive accuracy of regression trees, while retaining as much as possible their comprehensibility and computational efficiency. Our study is divided in three main parts.
1997
- (Wang & Witten, 1997) ⇒ Yong Wang, and Ian H. Witten. (1997). “Inducing Model Trees for Continuous Classes.” In: Proceedings of the European Conference on Machine Learning.
- ABSTRACT: Many problems encountered when applying machine learning in practice involve predicting a class that takes on a continuous numeric value, yet few machine learning schemes are able to do this. This paper describes a rational reconstruction of M5, a method developed by Quinlan (1992) for inducing trees of regression models. In order to accommodate data typically encountered in practice it is necessary to deal effectively with enumerated attributes and with missing values, and techniques devised by Breiman et al. (1984) are adapted for this purpose. The resulting system seems to outperform M5, based on the scanty published data that is available.