Two algorithms are presented for finding the values that occur more than $n \div k$ times in array b[0:n-1]. The second algorithm requires time $O(n \log(k))$ and extra space $O(k)$. We prove that $O(n \log(k))$ is a lower bound on the time required for any algorithm based on comparing array elements, so that the second algorithm is optimal. As special cases, determining whether a value occurs more than $n \div 2$ times requires linear time, but determining whether there are duplicates (the case $k=n$) requires time $O(n \log(n))$. The algorithms may be interesting from the standpoint of programming methodology; each was developed as an extension of an algorithm for the simple case $k=2$.
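The abstract does not spell out either algorithm, so the following is only a minimal sketch of the counter-based idea commonly associated with this problem, not the paper's own code. It assumes hashable values and uses a Python dict for the candidate set, which swaps the comparison-based structure behind the $O(n \log(k))$ bound for hashing. The function name `frequent_values` and its parameters are hypothetical. The first pass keeps at most $k-1$ candidates; the second pass counts them exactly, since the first pass can retain values that are not actually frequent.

```python
from collections import Counter


def frequent_values(b, k):
    """Return the set of values occurring more than len(b) / k times in b.

    Illustrative sketch: hypothetical name, dict-based candidate store.
    """
    if k < 2:
        raise ValueError("k must be at least 2")

    # Pass 1: keep at most k - 1 candidate values with positive counters.
    counters = {}
    for x in b:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            # x plus the k - 1 stored candidates are k distinct values:
            # cancel one occurrence of each and drop exhausted candidates.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]

    # Pass 2: count the surviving candidates exactly and keep the true hits.
    n = len(b)
    exact = Counter(x for x in b if x in counters)
    return {value for value, count in exact.items() if count * k > n}


if __name__ == "__main__":
    print(frequent_values([1, 2, 1, 3, 1, 2, 1], 2))
    print(frequent_values([1, 1, 2, 2, 3], 3))
```

With $k=2$ the sketch reduces to a single candidate and counter, the simple case the abstract mentions. With $k=n$ a non-empty result means the array contains a duplicate.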