This paper studies the problem of computing the most frequent element (the mode) by means of a distributed algorithm where the elements are located at the nodes of a network. Let $k$ denote the number of distinct elements, and let $m_i$ be the number of occurrences of the element $e_i$ in the ordered list of occurrences $m_1 > m_2 \geq \dots \geq m_k$. We give a deterministic distributed algorithm with time complexity $O(D + k)$, where $D$ denotes the diameter of the graph, which is essentially tight. As our main contribution, a Monte Carlo algorithm is presented which computes the mode in $O(D + F_2/m_1^2 \cdot \log k)$ time with high probability, where the frequency moment $F_t$ is defined as $F_t = \sum_{i=1}^{k} m_i^t$. This algorithm is substantially faster than the deterministic algorithm for various relevant frequency distributions. Moreover, we provide a lower bound of $\Omega(D + F_5/(m_1^5 B))$, where $B$ is the maximum message size, that captures the effect of the frequency distribution on the time complexity to compute the mode.
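To make the role of the frequency distribution concrete, the following minimal Python sketch evaluates the distribution-dependent terms of the three bounds for a given multiset of elements. It is an illustration of the formulas only, not the distributed algorithm itself. The function names `frequency_moment` and `bound_terms` are hypothetical, the diameter D and all hidden constants are ignored, and the natural logarithm is an assumption, since the abstract does not fix the base.

```python
import math
from collections import Counter


def frequency_moment(counts, t):
    """Return F_t, the sum of m_i^t over the occurrence counts m_i."""
    return sum(m ** t for m in counts)


def bound_terms(elements):
    """Evaluate the distribution-dependent terms of the stated bounds.

    Assumptions (not from the source): D and constant factors are
    dropped, and log is the natural logarithm.
    """
    # Occurrence counts in non-increasing order: m_1 >= m_2 >= ... >= m_k.
    counts = sorted(Counter(elements).values(), reverse=True)
    k = len(counts)
    m1 = counts[0]
    f2 = frequency_moment(counts, 2)
    f5 = frequency_moment(counts, 5)
    return {
        "k": k,
        "m1": m1,
        # Deterministic algorithm: O(D + k), so the term is k.
        "deterministic": k,
        # Monte Carlo algorithm: O(D + F_2 / m_1^2 * log k).
        "monte_carlo": f2 / m1 ** 2 * math.log(k) if k > 1 else 0.0,
        # Lower bound: Omega(D + F_5 / (m_1^5 * B)); B is left as a factor.
        "lower_bound_times_B": f5 / m1 ** 5,
    }


if __name__ == "__main__":
    # A skewed distribution: one dominant element and 99 rare ones.
    skewed = ["a"] * 1000 + [f"x{i}" for i in range(99)]
    print(bound_terms(skewed))
```

For a skewed input such as the one above, the Monte Carlo term stays close to log k while the deterministic term is k, which matches the abstract's claim that the randomized algorithm is faster for such distributions. When all counts are nearly equal, F_2/m_1^2 approaches k and the advantage disappears.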