Performance study of peer-to-peer file sharing

  • Authors: Leonard Kleinrock; Saurabh Tewari
  • Affiliations: University of California, Los Angeles; University of California, Los Angeles
  • Venue: Performance study of peer-to-peer file sharing
  • Year: 2007

Abstract

This dissertation evaluates the basic performance issues in peer-to-peer file sharing systems. We obtain analytic expressions for the search time (the time to locate a source from which to download the file) and the query-processing load (the number of nodes queried) as a function of the number of replicas of the file being searched for. The expressions cover the three basic search mechanisms: Distributed Hash Table (DHT) based search in structured networks, controlled flooding search in unstructured networks, and random walk search in unstructured networks. Each is analyzed both when the demand pattern is uniform and when it exhibits clustering. We find the minimum average search time to be related to the entropy of the file request probability distribution and provide an information-theoretic explanation for this by mapping the search problem to a noiseless coding problem. We also show that replicating files in proportion to their request rates (i.e., proportional replication) minimizes the search time for controlled flooding search in unstructured networks as well as the search time and the query-processing load in structured networks. Proportional replication is also shown to minimize the download time and the network bandwidth used (the download cost) and, at the same time, to ensure fairness in download load distribution and stability in files that are shared. Proportional replication is shown to be "natural" in three senses: standard cache replacement policies like LRU and FIFO automatically give near-optimal replica distributions; the system performance is not very sensitive near the optimal values; and cache replacement algorithms like LRU and FIFO adapt to changes in file request patterns relatively quickly. We also find that, unless the demand pattern exhibits extreme clustering, almost all of the clustering benefit is achieved without precise tuning of the clustering in the topology when LRU- or FIFO-like cache replacement algorithms are used.
Thus, even in unstructured networks, where the query-processing load under proportional replication can be 3-4 times higher than the minimum possible (for typical file request rate patterns), the far greater reductions in search time, download time, and network bandwidth used that proportional replication provides make it the preferred approach for most applications.
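
The ideas of proportional replication, request entropy, and LRU caching can be illustrated with a small simulation. The sketch below is not the dissertation's model. It assumes a Zipf-like request distribution, nodes that request files independently, and a node that caches every file it requests in a fixed-size LRU cache. All function names and parameters (zipf_probs, entropy_bits, proportional_targets, simulate_lru, and their arguments) are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import OrderedDict


def zipf_probs(num_files, exponent=1.0):
    """Assumed request distribution: probability proportional to 1 / rank^exponent."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def entropy_bits(probs):
    """Shannon entropy (in bits) of the file request distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def proportional_targets(probs, total_replicas):
    """Replica count per file when replicas are split in proportion to request rates."""
    return [p * total_replicas for p in probs]


def simulate_lru(probs, num_nodes, cache_size, num_requests, seed=0):
    """Count replicas per file after random nodes request files and cache them with LRU."""
    rng = random.Random(seed)
    caches = [OrderedDict() for _ in range(num_nodes)]
    files = range(len(probs))
    for _ in range(num_requests):
        cache = caches[rng.randrange(num_nodes)]
        file_id = rng.choices(files, weights=probs)[0]
        if file_id in cache:
            cache.move_to_end(file_id)      # refresh recency on a repeat request
        else:
            cache[file_id] = True           # assumed: every requested file is cached
            if len(cache) > cache_size:
                cache.popitem(last=False)   # evict the least recently used file
    counts = [0] * len(probs)
    for cache in caches:
        for file_id in cache:
            counts[file_id] += 1
    return counts


if __name__ == "__main__":
    probs = zipf_probs(50)
    counts = simulate_lru(probs, num_nodes=200, cache_size=5, num_requests=20000)
    targets = proportional_targets(probs, sum(counts))
    print(f"request entropy: {entropy_bits(probs):.2f} bits")
    for file_id in (0, 1, 9, 49):
        print(f"file {file_id}: LRU replicas={counts[file_id]}, "
              f"proportional target={targets[file_id]:.1f}")
```

Because each node holds at most one copy of a file, the replica count of a very popular file saturates at the number of nodes, so the match with the proportional targets in this toy setting is only approximate. The sketch shows the qualitative trend (more requests, more replicas) and does not reproduce the near-optimality result stated in the abstract.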