1999 An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants


Subject Headings: Empirical Evaluation, Bagging Algorithm, Boosting Algorithm, AdaBoost Algorithm, Wagging Algorithm.

Notes

Cited By

Quotes

Author Keywords

Abstract

Methods for voting classification algorithms, such as Bagging and AdaBoost, have been shown to be very successful in improving the accuracy of certain classifiers for artificial and real-world datasets. We review these algorithms and describe a large empirical study comparing several variants in conjunction with a decision tree inducer (three variants) and a Naive-Bayes inducer. The purpose of the study is to improve our understanding of why and when these algorithms, which use perturbation, re-weighting, and combination techniques, affect classification error. We provide a bias and variance decomposition of the error to show how different methods and variants influence these two terms. This allowed us to determine that Bagging reduced variance of unstable methods, while boosting methods (AdaBoost and Arc-x4) reduced both the bias and variance of unstable methods but increased the variance for Naive-Bayes, which was very stable. We observed that Arc-x4 behaves differently than AdaBoost if reweighting is used instead of resampling, indicating a fundamental difference. Voting variants, some of which are introduced in this paper, include: pruning versus no pruning, use of probabilistic estimates, weight perturbations (Wagging), and backfitting of data. We found that Bagging improves when probabilistic estimates in conjunction with no-pruning are used, as well as when the data was backfit. We measure tree sizes and show an interesting positive correlation between the increase in the average tree size in AdaBoost trials and its success in reducing the error. We compare the mean-squared error of voting methods to non-voting methods and show that the voting methods lead to large and significant reductions in the mean-squared errors. Practical problems that arise in implementing boosting algorithms are explored, including numerical instabilities and underflows. We use scatterplots that graphically show how AdaBoost reweights instances, emphasizing not only “hard” areas but also outliers and noise.
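The bias/variance decomposition referred to above can be estimated empirically. Below is a minimal Python sketch, not the paper's code, assuming the Kohavi-Wolpert zero-one-loss decomposition with deterministic targets, a fresh training sample drawn without replacement for each trial, and scikit-learn's DecisionTreeClassifier as a stand-in for the paper's MC4 inducer:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bias_variance_01(X, y, X_test, y_test, n_trials=50, seed=0):
    """Estimate Kohavi-Wolpert bias^2 and variance terms of the 0-1 loss."""
    rng = np.random.default_rng(seed)
    labels = np.unique(y)                         # sorted class labels
    counts = np.zeros((len(X_test), len(labels)))
    for _ in range(n_trials):
        # Fresh training sample per trial, drawn without replacement.
        idx = rng.choice(len(X), size=len(X) // 2, replace=False)
        pred = DecisionTreeClassifier().fit(X[idx], y[idx]).predict(X_test)
        counts[np.arange(len(X_test)), np.searchsorted(labels, pred)] += 1
    p = counts / n_trials                         # P(predicted label | x)
    target = np.zeros_like(p)                     # deterministic true-label distribution
    target[np.arange(len(X_test)), np.searchsorted(labels, y_test)] = 1.0
    bias2 = 0.5 * ((target - p) ** 2).sum(axis=1)
    variance = 0.5 * (1.0 - (p ** 2).sum(axis=1))
    return bias2.mean(), variance.mean()          # sum equals the expected 0-1 loss

Per test instance, bias^2 + variance algebraically reduces to 1 minus the probability that the predicted label equals the true label, so the two returned terms sum to the expected zero-one error over trials.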

1. Introduction

(...)

7.5. Wagging and Backfitting Data

An interesting variant of Bagging that we tried is called Wagging (Weight Aggregation). This method seeks to repeatedly perturb the training set as in Bagging, but instead of sampling from it, Wagging adds Gaussian noise with mean zero and a given standard deviation (e.g., 2) to each instance weight. For each trial, we start with uniformly weighted instances, add noise to the weights, and induce a classifier. The method has the nice property that one can trade off bias and variance: by increasing the standard deviation of the noise, more instances have their weight driven down to zero and disappear, thus increasing bias and reducing variance. Experiments showed that with a standard deviation of 2–3 the method finishes head-to-head with the best variant of Bagging used above: the error of Bagged MC4 without pruning and with scoring was 10.21%, while the errors for Wagging with standard deviations of 2, 2.5, and 3 were 10.19%, 10.16%, and 10.12%, respectively. These differences are not significant. Results for Naive-Bayes were similar.
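As a concrete illustration, here is a minimal sketch of the Wagging loop described above. The details are assumptions, not the paper's implementation: initial weights of 1, negatively perturbed weights clipped to zero, scikit-learn's DecisionTreeClassifier standing in for MC4, and an unweighted majority vote over the trials:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def wagging_predict(X, y, X_test, n_trials=25, noise_std=2.0, seed=0):
    """Wagging: vote classifiers trained on Gaussian-perturbed instance weights."""
    rng = np.random.default_rng(seed)
    all_preds = []
    for _ in range(n_trials):
        # Start from uniform weights and add zero-mean Gaussian noise.
        w = 1.0 + rng.normal(0.0, noise_std, size=len(X))
        w = np.clip(w, 0.0, None)      # weights pushed below zero drop the instance
        clf = DecisionTreeClassifier().fit(X, y, sample_weight=w)
        all_preds.append(clf.predict(X_test))
    votes = np.asarray(all_preds)

    def majority(column):              # unweighted vote over the n_trials classifiers
        vals, cnts = np.unique(column, return_counts=True)
        return vals[np.argmax(cnts)]

    return np.apply_along_axis(majority, 0, votes)

Note how the bias/variance knob appears here: with noise_std = 2, roughly a third of the instances (those whose noise falls below -1) receive zero weight in a given trial, and raising noise_std removes more of them, which is exactly the trade-off described above.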

References

BibTeX

@article{1999_AnEmpiricalComparisonofVotingCl,
  author    = {Eric Bauer and
               Ron Kohavi},
  title     = {An Empirical Comparison of Voting Classification Algorithms: Bagging,
               Boosting, and Variants},
  journal   = {Mach. Learn.},
  volume    = {36},
  number    = {1-2},
  pages     = {105--139},
  year      = {1999},
  url       = {https://doi.org/10.1023/A:1007515423169},
  doi       = {10.1023/A:1007515423169},
}

