2011 EnablingFastPredictionforEnsemb
- (Zhang et al., 2011) ⇒ Peng Zhang, Jun Li, Peng Wang, Byron J. Gao, Xingquan Zhu, and Li Guo. (2011). “Enabling Fast Prediction for Ensemble Models on Data Streams.” In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2011). ISBN:978-1-4503-0813-7 doi:10.1145/2020408.2020442
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222011%22+Enabling+Fast+Prediction+for+Ensemble+Models+on+Data+Streams
- http://dl.acm.org/citation.cfm?id=2020408.2020442&preflayout=flat#citedby
Quotes
Author Keywords
- Algorithms; concept drifting; data mining; ensemble learning; performance; spatial indexing; stream classification; stream data mining
Abstract
Ensemble learning has become a common tool for data stream classification, being able to handle large volumes of stream data and concept drifting. Previous studies focus on building accurate prediction models from stream data. However, a linear scan of a large number of base classifiers in the ensemble during prediction incurs significant costs in response time, preventing ensemble learning from being practical for many real world time-critical data stream applications, such as Web traffic stream monitoring, spam detection, and intrusion detection. In these applications, data streams usually arrive at a speed of GB/second, and it is necessary to classify each stream record in a timely manner. To address this problem, we propose a novel Ensemble-tree (E-tree for short) indexing structure to organize all base classifiers in an ensemble for fast prediction. On one hand, E-trees treat ensembles as spatial databases and employ an R-tree like height-balanced structure to reduce the expected prediction time from linear to sub-linear complexity. On the other hand, E-trees can automatically update themselves by continuously integrating new classifiers and discarding outdated ones, well adapting to new trends and patterns underneath data streams. Experiments on both synthetic and real-world data streams demonstrate the performance of our approach.
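The indexing idea in the abstract can be illustrated with a small sketch. This is not the paper's E-tree. It assumes, for illustration only, that each base classifier is a decision rule covering an axis-aligned rectangle in feature space, and it groups rules into a single flat level of bounding boxes rather than a height-balanced tree. All names (`Rule`, `Node`, `build_index`, `predict_linear`, `predict_indexed`) are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Rule:
    """Hypothetical base classifier: a labelled axis-aligned rectangle."""
    low: Point
    high: Point
    label: str

    def covers(self, x: Point) -> bool:
        return all(l <= v <= h for l, v, h in zip(self.low, x, self.high))


@dataclass
class Node:
    """A group of rules plus the minimum bounding rectangle around them."""
    rules: List[Rule]
    low: Point = field(init=False)
    high: Point = field(init=False)

    def __post_init__(self) -> None:
        dims = range(len(self.rules[0].low))
        self.low = tuple(min(r.low[d] for r in self.rules) for d in dims)
        self.high = tuple(max(r.high[d] for r in self.rules) for d in dims)

    def may_cover(self, x: Point) -> bool:
        return all(l <= v <= h for l, v, h in zip(self.low, x, self.high))


def build_index(rules: List[Rule], node_size: int = 2) -> List[Node]:
    # Sort by lower corner so nearby rules tend to share a node (a crude heuristic).
    ordered = sorted(rules, key=lambda r: r.low)
    return [Node(ordered[i:i + node_size]) for i in range(0, len(ordered), node_size)]


def predict_linear(rules: List[Rule], x: Point) -> Optional[str]:
    # Baseline: test every rule, then take a majority vote.
    votes = Counter(r.label for r in rules if r.covers(x))
    return votes.most_common(1)[0][0] if votes else None


def predict_indexed(nodes: List[Node], x: Point) -> Optional[str]:
    # Skip every node whose bounding rectangle cannot contain x.
    votes: Counter = Counter()
    for node in nodes:
        if node.may_cover(x):
            votes.update(r.label for r in node.rules if r.covers(x))
    return votes.most_common(1)[0][0] if votes else None
```

Both functions return the same vote, but the indexed version skips whole groups of rules whose bounding rectangle excludes the record. A flat grouping like this still visits every node, so the sub-linear expected cost the abstract describes would need the hierarchical, height-balanced structure that this sketch leaves out.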
References
Page | Author | title | doi | year |
---|---|---|---|---|
2011 EnablingFastPredictionforEnsemb | Xingquan Zhu, Peng Zhang, Jun Li, Peng Wang, Byron J. Gao, Li Guo | Enabling Fast Prediction for Ensemble Models on Data Streams | 10.1145/2020408.2020442 | 2011 |