Ensemble-based Prediction Function

From GM-RKB
(Redirected from ensemble model)

An Ensemble-based Prediction Function is a composite prediction function that combines the outputs of many "base" prediction functions.
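The composition described above can be illustrated with a minimal sketch. It assumes that the base functions are plain callables and that their outputs are merged by a caller-supplied combiner (majority vote for labels, mean for numbers). The names `make_ensemble`, `majority_vote`, and the toy base functions are hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import mean

def make_ensemble(base_functions, combine):
    """Return a prediction function that applies every base function
    to the input and merges their outputs with `combine`."""
    def predict(x):
        return combine([f(x) for f in base_functions])
    return predict

def majority_vote(outputs):
    # Most frequent output wins; ties go to the output seen first.
    return Counter(outputs).most_common(1)[0][0]

# Hypothetical base classifiers that disagree near the decision boundary.
base_classifiers = [
    lambda x: "pos" if x > 0 else "neg",
    lambda x: "pos" if x > 1 else "neg",
    lambda x: "pos" if x > -1 else "neg",
]
classify = make_ensemble(base_classifiers, majority_vote)

# Hypothetical base regressors whose errors cancel when averaged.
base_regressors = [
    lambda x: 2.0 * x,
    lambda x: 2.0 * x + 1.0,
    lambda x: 2.0 * x - 1.0,
]
regress = make_ensemble(base_regressors, mean)
```

Under these assumptions, `classify(0.5)` collects the votes `pos`, `neg`, `pos` and returns the majority label, while `regress(3.0)` averages the three base outputs.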



References

2017A

2017B

  • (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Decision_tree_learning#Decision_tree_types Retrieved:2017-10-15.
    • Decision trees used in data mining are of two main types:
      • Classification tree analysis is when the predicted outcome is the class to which the data belongs.
      • Regression tree analysis is when the predicted outcome can be considered a real number (e.g. the price of a house, or a patient's length of stay in a hospital).
    • The term Classification And Regression Tree (CART) analysis is an umbrella term used to refer to both of the above procedures, first introduced by Breiman et al.[1] Trees used for regression and trees used for classification have some similarities - but also some differences, such as the procedure used to determine where to split.
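One way to see how the split procedure can differ between the two tree types is the sketch below. It assumes Gini impurity as the classification criterion and within-node variance as the regression criterion, which are common choices but are not stated in the passage above. The function names and the toy data are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """Impurity of a set of class labels (assumed classification criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def variance(values):
    """Population variance of numeric targets (assumed regression criterion)."""
    n = len(values)
    if n == 0:
        return 0.0
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / n

def best_threshold(xs, ys, impurity):
    """Scan midpoints between distinct x values and return the
    (threshold, weighted impurity) pair with the lowest impurity.
    Returns (None, impurity(ys)) if no split improves on the parent."""
    n = len(xs)
    best = (None, impurity(list(ys)))
    distinct = sorted(set(xs))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if score < best[1]:
            best = (t, score)
    return best

xs = [1, 2, 3, 4]
labels = ["a", "a", "b", "b"]          # classification target
prices = [100.0, 110.0, 300.0, 310.0]  # regression target

class_split = best_threshold(xs, labels, gini_impurity)
reg_split = best_threshold(xs, prices, variance)
```

The search loop is shared; only the impurity function passed in changes, which is the kind of difference the passage alludes to.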

      Some techniques, often called ensemble methods, construct more than one decision tree:

      • Boosted trees: incrementally build an ensemble by training each new tree to emphasize the training instances that earlier trees mis-modeled. A typical example is AdaBoost. These can be used for regression-type and classification-type problems. [2] [3]
      • Bootstrap aggregated (or bagged) decision trees: an early ensemble method that builds multiple decision trees by repeatedly resampling the training data with replacement and having the trees vote for a consensus prediction. [4]
      • Rotation forest - in which every decision tree is trained by first applying principal component analysis (PCA) on a random subset of the input features. [5]
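The bagging procedure in the list above (resample with replacement, fit one tree per sample, vote) can be sketched as follows. To keep the example short it assumes one-feature data and uses single-split trees (stumps) in place of full decision trees. `fit_stump`, `bagged_stumps`, `vote`, and the toy data are hypothetical names and values.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Fit a one-split tree on (x, label) pairs by minimizing
    the number of misclassified training instances."""
    majority = Counter(lbl for _, lbl in sample).most_common(1)[0][0]
    xs = sorted({x for x, _ in sample})
    best = None  # (errors, threshold, left_label, right_label)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = Counter(lbl for x, lbl in sample if x <= t)
        right = Counter(lbl for x, lbl in sample if x > t)
        l_lbl = left.most_common(1)[0][0]
        r_lbl = right.most_common(1)[0][0]
        errors = (sum(left.values()) - left[l_lbl]) + (sum(right.values()) - right[r_lbl])
        if best is None or errors < best[0]:
            best = (errors, t, l_lbl, r_lbl)
    if best is None:
        # Only one distinct x value in the sample: always predict the majority label.
        return lambda x: majority
    _, t, l_lbl, r_lbl = best
    return lambda x: l_lbl if x <= t else r_lbl

def bagged_stumps(data, n_trees, rng):
    """Train n_trees stumps, each on a bootstrap sample of the data."""
    models = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]  # resampling with replacement
        models.append(fit_stump(sample))
    return models

def vote(models, x):
    """Consensus prediction: the label predicted by the most trees."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

data = [(1, "low"), (2, "low"), (3, "low"), (7, "high"), (8, "high"), (9, "high")]
models = bagged_stumps(data, n_trees=11, rng=random.Random(0))
prediction = vote(models, 2.5)
```

Each bootstrap sample has the same size as the original data, so some instances appear several times and others not at all; that variation between samples is what makes the individual trees differ.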

        A special case of a decision tree is a decision list, which is a one-sided decision tree: every internal node has exactly 1 leaf node and exactly 1 internal node as children (except for the bottommost node, whose only child is a single leaf node). While less expressive, decision lists are arguably easier to understand than general decision trees due to their added sparsity, and they permit non-greedy learning methods and the imposition of monotonic constraints.
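A decision list of this kind can be read as an ordered chain of if-then rules, where each rule either emits its leaf label or hands the input to the next rule. The sketch below assumes rules are (condition, label) pairs with a final default label standing in for the bottommost leaf. The loan-style attributes and thresholds are hypothetical and purely illustrative.

```python
def make_decision_list(rules, default):
    """Build a predictor from ordered (condition, label) rules.
    The first rule whose condition holds decides the output;
    `default` plays the role of the bottommost leaf."""
    def predict(record):
        for condition, label in rules:
            if condition(record):
                return label
        return default
    return predict

# Hypothetical rules, evaluated strictly in order.
rules = [
    (lambda r: r["income"] < 20000, "reject"),
    (lambda r: r["debt_ratio"] > 0.5, "reject"),
    (lambda r: r["years_employed"] >= 2, "approve"),
]
decide = make_decision_list(rules, default="review")
```

Because only one path continues past each test, the whole model can be read top to bottom, which is one reason such lists are considered easy to interpret.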

        Decision tree learning is the construction of a decision tree from class-labeled training tuples. A decision tree is a flow-chart-like structure, where each internal (non-leaf) node denotes a test on an attribute, each branch represents the outcome of a test, and each leaf (or terminal) node holds a class label. The topmost node in a tree is the root node.
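The flow-chart structure described above can be sketched as a small data structure: internal nodes hold an attribute test, branches map test outcomes to children, and leaves hold class labels. The class names `Node` and `Leaf`, the `classify` helper, and the weather-style example tree are hypothetical and not taken from any particular library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

@dataclass
class Leaf:
    label: str  # class label held by a terminal node

@dataclass
class Node:
    attribute: str                            # attribute tested at this node
    test: Callable[[object], str]             # maps the attribute value to an outcome
    branches: Dict[str, Union["Node", Leaf]]  # one child per test outcome

def classify(tree, record):
    """Follow branches from the root until a leaf is reached."""
    while isinstance(tree, Node):
        outcome = tree.test(record[tree.attribute])
        tree = tree.branches[outcome]
    return tree.label

# Hand-built example tree; the root node tests "outlook".
root = Node(
    attribute="outlook",
    test=lambda v: v,
    branches={
        "sunny": Node(
            attribute="humidity",
            test=lambda h: "high" if h > 75 else "normal",
            branches={"high": Leaf("no"), "normal": Leaf("yes")},
        ),
        "overcast": Leaf("yes"),
        "rain": Leaf("no"),
    },
)
```

Here the tree is written by hand; decision tree learning is the process of constructing such a structure automatically from class-labeled training tuples.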

        There are many specific decision-tree algorithms. Notable ones include:

      • ID3 (Iterative Dichotomiser 3)
      • C4.5 (successor of ID3)
      • CART (Classification And Regression Tree)
      • CHAID (CHi-squared Automatic Interaction Detector). Performs multi-level splits when computing classification trees.
      • MARS: extends decision trees to handle numerical data better.
      • Conditional Inference Trees. Statistics-based approach that uses non-parametric tests as splitting criteria, corrected for multiple testing to avoid overfitting. This approach results in unbiased predictor selection and does not require pruning.[6] [7]
    • ID3 and CART were invented independently at around the same time (between 1970 and 1980), yet follow a similar approach for learning a decision tree from training tuples.
  1. Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984). Classification and regression trees. Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software. ISBN 978-0-412-04841-8.
  2. Friedman, J. H. (1999). Stochastic gradient boosting. Stanford University.
  3. Hastie, T., Tibshirani, R., Friedman, J. H. (2001). The elements of statistical learning : Data mining, inference, and prediction. New York: Springer Verlag.
  4. Breiman, L. (1996). "Bagging Predictors". Machine Learning, 24: pp. 123-140.
  5. Rodriguez, J.J. and Kuncheva, L.I. and Alonso, C.J. (2006), Rotation forest: A new classifier ensemble method, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10):1619-1630.
  6. Hothorn, T.; Hornik, K.; Zeileis, A. (2006). "Unbiased Recursive Partitioning: A Conditional Inference Framework". Journal of Computational and Graphical Statistics. 15 (3): 651–674. JSTOR 27594202. doi:10.1198/106186006X133933.
  7. Strobl, C.; Malley, J.; Tutz, G. (2009). "An Introduction to Recursive Partitioning: Rationale, Application and Characteristics of Classification and Regression Trees, Bagging and Random Forests". Psychological Methods. 14 (4): 323–348. doi:10.1037/a0016973.