Efficient frequent item counting in multi-core hardware
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Hi-index | 0.00 |
We present a deterministic parallel algorithm for the k-majority problem, that can be used to find in parallel frequent items, i.e. those whose multiplicity is greater than a given threshold, and is therefore useful to process iceberg queries and in many other different contexts of applied mathematics and information theory. The algorithm can be used both in the online (stream) context and in the offline setting, the difference being that in the former case we are restricted to a single scan of the input elements, so that verifying the frequent items that have been determined is not allowed (e.g. network traffic streams passing through internet routers), while in the latter a parallel scan of the input can be used to determine the actual k-majority elements. To the best of our knowledge, this is the first parallel algorithm solving the proposed problem. Copyright © 2011 John Wiley & Sons, Ltd.