Compact samples for data dissemination

  • Authors:
  • Tova Milo;Assaf Sagi;Elad Verbin

  • Affiliations:
  • School of Computer Science, Tel Aviv University;School of Computer Science, Tel Aviv University;School of Computer Science, Tel Aviv University

  • Venue:
  • ICDT'07 Proceedings of the 11th international conference on Database Theory
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

We consider data dissemination in a peer-to-peer network, where each user wishes to obtain some subset of the available information objects. In most of the modern algorithms for such data dissemination, the users periodically obtain samples of peer IDs (possibly with some summary of their content). They then use the samples for connecting to other peers and downloading data pieces from them. For a set O of information objects, we call a sample of peers, containing at least k possible providers for each object o∈O, a k-sample. In order to balance the load, the k-samples should be fair, in the sense that for every object, its providers should appear in the sample with equal probability. Also, since most algorithms send fresh samples frequently, the size of the k-samples should be as small as possible, to minimize communication overhead. We describe in this paper two novel techniques for generating fair and small k-samples in a P2P setting. The first is based on a particular usage of uniform sampling and has the advantage that it allows to build on standard P2P uniform sampling tools. The second is based on non-uniform sampling and requires more particular care, but is guaranteed to generate the smallest possible fair k-sample. The two algorithms exploit available dependencies between information objects to reduce the sample size, and are proved, both theoretically and experimentally, to be extremely effective.