2012 EfficientFrequentItemCountingin

Subject Headings:

Notes

We start with the observation that frequent item counting, like other data mining tasks, assumes a certain amount of skew in the data. We exploit this skew to design a new algorithm that uses a pre-filtering stage that can be implemented in a highly efficient manner through SIMD instructions. Using pipelining, we then combine this pre-filtering stage with a conventional frequent item algorithm (Space-Saving) that processes the remainder of the data. The resulting operator can be parallelized with a small number of cores, leading to a parallel implementation that does not suffer from any of the overheads of existing parallel solutions when querying the results and offers significantly higher throughput.
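The two-stage idea in the abstract can be illustrated with a minimal, scalar sketch. This is not the paper's implementation: the abstract does not say how the filter is populated or how the SIMD comparison is laid out, so the fixed `hot_items` set, the `count_with_prefilter` function, and the dictionary-based `SpaceSaving` class below are all assumptions made for illustration. Only the overall shape (a cheap filter absorbs the skewed head of the stream, and Space-Saving handles the remainder) comes from the text.

```python
from collections import Counter


class SpaceSaving:
    """Space-Saving summary that monitors at most `capacity` items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}  # item -> estimated count (may overestimate)

    def offer(self, item):
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            self.counts[item] = 1
        else:
            # Evict the item with the smallest count; the newcomer
            # inherits that count plus one.
            victim = min(self.counts, key=self.counts.get)
            self.counts[item] = self.counts.pop(victim) + 1


def count_with_prefilter(stream, hot_items, capacity):
    """Hypothetical two-stage counter.

    Assumption: `hot_items` is a small, fixed set of suspected heavy
    hitters. A real SIMD filter would compare each input against
    several filter slots in one instruction; here a dict lookup
    stands in for that comparison.
    """
    filter_counts = {item: 0 for item in hot_items}
    summary = SpaceSaving(capacity)
    for item in stream:
        if item in filter_counts:
            filter_counts[item] += 1  # exact count, never evicted
        else:
            summary.offer(item)  # remainder goes to Space-Saving
    merged = Counter(filter_counts)
    merged.update(summary.counts)
    return merged


stream = ["a"] * 5 + ["b"] * 3 + ["c", "d", "e"]
result = count_with_prefilter(stream, hot_items=["a"], capacity=2)
```

In this sketch, items caught by the filter get exact counts, while counts from the Space-Saving stage are upper bounds, because an evicted item's count is passed on to its replacement.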


	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2012 EfficientFrequentItemCountingin	Pratanu Roy, Jens Teubner, Gustavo Alonso			Efficient Frequent Item Counting in Multi-core Hardware				10.1145/2339530.2339757		2012