2012 EfficientFrequentItemCountingin
- (Roy et al., 2012) ⇒ Pratanu Roy, Jens Teubner, and Gustavo Alonso. (2012). “Efficient Frequent Item Counting in Multi-core Hardware.” In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2012). ISBN:978-1-4503-1462-6 doi:10.1145/2339530.2339757
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222012%22+Efficient+Frequent+Item+Counting+in+Multi-core+Hardware
- http://dl.acm.org/citation.cfm?id=2339530.2339757&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
The increasing number of cores and the rich instruction sets of modern hardware are opening up new opportunities for optimizing many traditional data mining tasks. In this paper we demonstrate how to speed up the performance of the computation of frequent items by almost one order of magnitude over the best published results by matching the algorithm to the underlying hardware architecture.
We start with the observation that frequent item counting, like other data mining tasks, assumes a certain amount of skew in the data. We exploit this skew to design a new algorithm that uses a pre-filtering stage that can be implemented in a highly efficient manner through SIMD instructions. Using pipelining, we then combine this pre-filtering stage with a conventional frequent item algorithm (Space-Saving) that will process the remainder of the data. The resulting operator can be parallelized with a small number of cores, leading to a parallel implementation that does not suffer any of the overheads of existing parallel solutions when querying the results and offers significantly higher throughput.
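The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the filter is populated or laid out, so the fixed `hot_items` list, the function name `count_with_prefilter`, and the dictionary-based `SpaceSaving` class are all assumptions made for illustration. A plain dictionary lookup stands in for the SIMD comparison, and the stages run sequentially rather than pipelined across cores.

```python
class SpaceSaving:
    """Space-Saving summary that monitors at most `capacity` items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}  # item -> estimated count (may overestimate)

    def offer(self, item):
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            self.counts[item] = 1
        else:
            # Evict the item with the smallest count; the newcomer
            # inherits that count plus one (the Space-Saving rule).
            victim = min(self.counts, key=self.counts.get)
            self.counts[item] = self.counts.pop(victim) + 1


def count_with_prefilter(stream, hot_items, capacity):
    """Count `hot_items` exactly in a small filter; forward misses to Space-Saving.

    `hot_items` is a hypothetical, fixed filter content chosen by the caller.
    """
    filter_counts = {item: 0 for item in hot_items}
    summary = SpaceSaving(capacity)
    for item in stream:
        if item in filter_counts:
            filter_counts[item] += 1  # hit: absorbed by the filter stage
        else:
            summary.offer(item)       # miss: handled by the second stage
    return filter_counts, summary.counts


if __name__ == "__main__":
    stream = ["a", "b", "a", "c", "a", "b", "d", "a", "e", "a"]
    hot, rest = count_with_prefilter(stream, hot_items=["a"], capacity=2)
    print(hot, rest)
```

With skewed input, most items hit the filter and never reach the more expensive Space-Saving update, which is the effect the abstract attributes to the pre-filtering stage. Counts in the filter are exact, while counts in the Space-Saving summary can overestimate.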
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2012 EfficientFrequentItemCountingin | Pratanu Roy, Jens Teubner, Gustavo Alonso | | | Efficient Frequent Item Counting in Multi-core Hardware | | | | 10.1145/2339530.2339757 | | 2012 |