Computing Frequent Elements Using Gossip

Authors:
Bibudh Lahiri;Srikanta Tirthapura
Affiliations:
Iowa State University, Ames, USA 50011;Iowa State University, Ames, USA 50011
Venue:
SIROCCO '08 Proceedings of the 15th International Colloquium on Structural Information and Communication Complexity
Year:
2008

Abstract

We present algorithms for identifying frequently occurring elements in a large distributed data set using gossip. Our algorithms do not rely on any central control or on an underlying network structure such as a spanning tree. Instead, nodes repeatedly select a random partner and exchange data with that partner. If this process continues for a short period of time, the desired results are computed, with probabilistic guarantees on their accuracy. Our algorithm for frequent elements is built by layering a novel small-space "sketch" of the data over a gossip-based data dissemination mechanism. We prove that the algorithm converges to the approximate frequent elements with high probability, and we provide bounds on the time until convergence. To our knowledge, this is the first work on computing frequent elements using gossip.
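
The random-partner exchange described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it replaces the small-space sketch with exact per-item counts and uses plain pairwise averaging, a standard gossip primitive, so that every node's state drifts toward the network-wide average count vector. The function name `gossip_frequent_items`, its parameters, and the list-of-lists input layout are assumptions made for illustration only.

```python
import random
from collections import Counter


def gossip_frequent_items(local_items, threshold, rounds, seed=0):
    """Estimate, at every node, the items whose global share is >= threshold.

    local_items: one list of items per node (hypothetical input layout).
    Assumes at least two nodes. Each node keeps fractional per-item counts;
    this is exact-count pairwise averaging, not the paper's sketch.
    """
    rng = random.Random(seed)
    n = len(local_items)
    states = [dict(Counter(items)) for items in local_items]

    for _ in range(rounds):
        for i in range(n):
            # Pick a partner uniformly among the other n - 1 nodes.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            # Both nodes replace their state with the element-wise average.
            keys = set(states[i]) | set(states[j])
            avg = {
                k: (states[i].get(k, 0.0) + states[j].get(k, 0.0)) / 2
                for k in keys
            }
            states[i] = avg
            states[j] = dict(avg)

    # Each node reports items whose estimated share meets the threshold.
    results = []
    for state in states:
        total = sum(state.values())
        results.append(
            {k for k, v in state.items() if total > 0 and v / total >= threshold}
        )
    return results
```

Averaging preserves the sum of every item's count across the network, so each node's ratio of an item's value to its total approaches the item's global relative frequency as the rounds proceed. Carrying exact counts costs space proportional to the number of distinct items, which is the cost a small-space sketch is meant to avoid.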