Partial content distribution on high performance networks

  • Authors:
  • Eric H. Weigle; Andrew A. Chien

  • Affiliations:
  • University of California, San Diego; University of California, San Diego

  • Venue:
  • Proceedings of the 16th international symposium on High performance distributed computing
  • Year:
  • 2007

Abstract

We present and analyze techniques to efficiently solve the partial content distribution problem: distributing a logical data set to receivers that individually desire only subsets of the total data. This problem is more general than, and fundamentally different from, traditional whole-file content distribution, and it presents new challenges and new optimization opportunities. It supports a wider variety of use models, e.g., striped file transfer, scatter/gather, or distributed editing. This work develops new metadata management and transfer scheduling techniques that provide good results on high performance networks. Distributed applications in such systems tend to have data requirements more complicated than just total overlap at every node: the desired transfers differ dramatically from whole-file content distribution. Traditional approaches perform poorly in such cases. We provide empirical data exhibiting these limitations, evaluate a new BitTorrent-based implementation of our ideas, and show order-of-magnitude improvements in bandwidth and latency.
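
The problem statement can be made concrete with a small toy model. The sketch below is only an illustration of partial versus whole-file demand, not the authors' metadata management or scheduling technique; the receiver names, the piece count, and the helper functions `receivers_by_piece` and `pieces_delivered` are all hypothetical and chosen for this example.

```python
from typing import Dict, Set

# Hypothetical example data: a data set split into 8 pieces, and three
# receivers that each want only a subset (partial, partly overlapping demand).
NUM_PIECES = 8
WANTS: Dict[str, Set[int]] = {
    "a": {0, 1, 2, 3},
    "b": {2, 3, 4, 5},
    "c": {6, 7},
}


def receivers_by_piece(wants: Dict[str, Set[int]]) -> Dict[int, Set[str]]:
    """Invert 'receiver -> wanted pieces' into 'piece -> receivers wanting it'.

    Receivers that want the same piece are the only ones that can usefully
    exchange it with each other.
    """
    by_piece: Dict[int, Set[str]] = {}
    for receiver, pieces in wants.items():
        for piece in pieces:
            by_piece.setdefault(piece, set()).add(receiver)
    return by_piece


def pieces_delivered(wants: Dict[str, Set[int]], num_pieces: int,
                     whole_file: bool) -> int:
    """Count piece deliveries under whole-file vs. partial distribution."""
    if whole_file:
        # Whole-file distribution sends every piece to every receiver.
        return num_pieces * len(wants)
    # Partial distribution sends each receiver only what it asked for.
    return sum(len(pieces) for pieces in wants.values())


if __name__ == "__main__":
    print(receivers_by_piece(WANTS))
    print(pieces_delivered(WANTS, NUM_PIECES, whole_file=True))
    print(pieces_delivered(WANTS, NUM_PIECES, whole_file=False))
```

In this toy setting, treating the transfer as whole-file distribution moves more than twice as many pieces as the receivers actually requested, and only pieces 2 and 3 are of interest to more than one receiver. This gives a rough sense of why requirements that fall short of total overlap at every node call for a different approach.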