Sampling bias in BitTorrent measurements

  • Authors:
  • Boxun Zhang;Alexandru Iosup;Johan Pouwelse;Dick Epema;Henk Sips

  • Affiliations:
  • Delft University of Technology, Delft, The Netherlands;Delft University of Technology, Delft, The Netherlands;Delft University of Technology, Delft, The Netherlands;Delft University of Technology, Delft, The Netherlands;Delft University of Technology, Delft, The Netherlands

  • Venue:
  • EuroPar'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Real-world measurements play an important role in understanding the characteristics and in improving the operation of BitTorrent, which is currently a popular Internet application. Much like measuring the Internet, the complexity and scale of the BitTorrent network make a single, complete measurement impractical. While a large number of measurements have already employed diverse sampling techniques to study parts of BitTorrent network, until now there exists no investigation of their sampling bias, that is, of their ability to objectively represent the characteristics of BitTorrent. In this work we present the first study of the sampling bias in BitTorrent measurements. We first introduce a novel taxonomy of sources of sampling bias in BitTorrent measurements. We then investigate the sampling among fifteen longterm BitTorrent measurements completed between 2004 and 2009, and find that different data sources and measurement techniques can lead to significantly different measurement results. Last, we formulate three recommendations to improve the design of future BitTorrent measurements, and estimate the cost of using these recommendations in practice.