Stacked Ensemble-based Learning Algorithm
A Stacked Ensemble-based Learning Algorithm is an ensemble learning algorithm that can be implemented by a stacked learning system to solve a stacked learning task: producing a decision function whose output is based on the outputs of several base models (see the notation sketch below).
- AKA: Stacking, Stacked Generalization, Stacking Algorithm.
- Example(s): Wolpert's Stacked Generalization Algorithm (Wolpert, 1992); Feature-Weighted Linear Stacking (Sill et al., 2009).
- Counter-Example(s): a Boosting Algorithm; a Bagging Algorithm.
- See: Bootstrapping Algorithm; Large-Margin Algorithm; Additive Model.
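- Notation sketch (symbols chosen here for illustration, not taken from a specific source): given trained base models f_1, ..., f_K and a trained combiner g, the stacked decision function is y(x) = g(f_1(x), ..., f_K(x)); the combiner's inputs are the base models' outputs rather than the raw features x.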
References
2018
- (Wikipedia, 2018) ⇒ http://en.wikipedia.org/wiki/Ensemble_learning#Stacking
- QUOTE: Stacking (sometimes called stacked generalization) involves training a learning algorithm to combine the predictions of several other learning algorithms. First, all of the other algorithms are trained using the available data, then a combiner algorithm is trained to make a final prediction using all the predictions of the other algorithms as additional inputs. If an arbitrary combiner algorithm is used, then stacking can theoretically represent any of the ensemble techniques described in this article, although, in practice, a logistic regression model is often used as the combiner.
Stacking typically yields performance better than any single one of the trained models.[1] It has been successfully used on both supervised learning tasks (regression,[2] classification and distance learning [3]) and unsupervised learning (density estimation).[4] It has also been used to estimate bagging's error rate.[5][6] It has been reported to out-perform Bayesian model-averaging.[7] The two top-performers in the Netflix competition utilized blending, which may be considered to be a form of stacking.[8]
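The quoted procedure can be made concrete with a short sketch. The following is a minimal illustration, not code from the quoted source: base models are trained on the available data, their out-of-fold predictions become the combiner's inputs, and, as the quote notes is common in practice, a logistic regression model serves as the combiner. All dataset and model choices here are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Illustrative data; any labeled dataset would do.
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "First, all of the other algorithms are trained using the available data."
base_models = [DecisionTreeClassifier(random_state=0),
               SVC(probability=True, random_state=0)]

# Out-of-fold predicted probabilities become the combiner's training inputs,
# so the combiner never sees a base model's fit to its own training fold.
meta_train = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_models])

# Refit each base model on the full training set for use at prediction time.
for m in base_models:
    m.fit(X_train, y_train)

# "A combiner algorithm is trained to make a final prediction"; per the
# quote, logistic regression is a common choice of combiner.
combiner = LogisticRegression().fit(meta_train, y_train)

# At prediction time the combiner sees only the base models' outputs.
meta_test = np.column_stack([m.predict_proba(X_test)[:, 1]
                             for m in base_models])
print("stacked test accuracy:", combiner.score(meta_test, y_test))
```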
2012
- (Domingos, 2012) ⇒ Pedro Domingos. (2012). “A Few Useful Things to Know About Machine Learning." In: Communications of the ACM Journal, 55(10). doi:10.1145/2347736.2347755
- QUOTE: ... In stacking, the outputs of individual classifiers become the inputs of a "higher-level" learner that figures out how best to combine them.
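Domingos's "higher-level" learner corresponds to what scikit-learn calls the final_estimator in its StackingClassifier. The following hedged sketch (dataset and model choices are again illustrative assumptions) shows the same pattern using that off-the-shelf API instead of the manual wiring above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, random_state=0)

# The base classifiers' outputs feed the "higher-level" learner
# (final_estimator); cv=5 means those outputs come from 5-fold
# cross-validated predictions, as in the manual sketch above.
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(),
    cv=5)

print("CV accuracy:", cross_val_score(stack, X, y, cv=5).mean())
```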
- ↑ Wolpert, D. (1992). "Stacked Generalization". Neural Networks, 5(2), pp. 241-259.
- ↑ Breiman, L. (1996). "Stacked Regressions". Machine Learning, 24, pp. 49-64.
- ↑ Ozay, M.; Yarman Vural, F. T. (2013). "A New Fuzzy Stacked Generalization Technique and Analysis of its Performance". arXiv:1204.0171.
- ↑ Smyth, P.; Wolpert, D. H. (1999). "Linearly Combining Density Estimators via Stacking". Machine Learning, 36, pp. 59-83.
- ↑ Rokach, L. (2010). "Ensemble-based Classifiers". Artificial Intelligence Review, 33(1-2), pp. 1-39.
- ↑ Wolpert, D. H.; Macready, W. G. (1999). "An Efficient Method to Estimate Bagging's Generalization Error". Machine Learning, 35, pp. 41-55.
- ↑ Clarke, B. (2003). "Bayes Model Averaging and Stacking When Model Approximation Error Cannot be Ignored". Journal of Machine Learning Research, 4, pp. 683-712.
- ↑ Sill, J.; Takacs, G.; Mackey, L.; Lin, D. (2009). "Feature-Weighted Linear Stacking". arXiv:0911.0460.