An In-Depth, Analytical Study of Sampling Techniques for Self-Similar Internet Traffic

  • Authors:
  • Guanghui He; Jennifer C. Hou

  • Affiliations:
  • University of Illinois at Urbana-Champaign; University of Illinois at Urbana-Champaign

  • Venue:
  • ICDCS '05 Proceedings of the 25th IEEE International Conference on Distributed Computing Systems
  • Year:
  • 2005

Abstract

Techniques for sampling Internet traffic are important for understanding the traffic characteristics of the Internet [14, 8]. Despite extensive research on packet sampling, no prior work has taken the self-similarity of Internet traffic into account in devising sampling strategies. In this paper, we perform an in-depth, analytical study of three sampling techniques for self-similar Internet traffic: static systematic sampling, stratified random sampling, and simple random sampling. We show that while all three techniques can accurately capture the Hurst parameter (second-order statistics) of Internet traffic, they fail to capture the mean (first-order statistics) faithfully. We also show that static systematic sampling yields the smallest variation in results across different sampling instances (i.e., it gives sampling results of high fidelity). Based on an important observation, we then devise a new variation of static systematic sampling, called biased systematic sampling (BSS), that gives much more accurate estimates of the mean while keeping the sampling overhead low. Both the analysis of the three sampling techniques and the evaluation of BSS are performed on synthetic and real Internet traffic traces. Our performance study shows that BSS gives a performance improvement (in terms of efficiency) of 40% and 20% over static systematic and simple random sampling, respectively.
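
For orientation, the three baseline techniques named in the abstract can be sketched as follows. This is a minimal illustration based on the textbook definitions of these sampling schemes, not the authors' implementation. It assumes the trace is a list of per-packet or per-interval measurements. The function names and the `interval`, `offset`, and `rng` parameters are hypothetical, chosen only for this example. BSS is not sketched because the abstract does not describe how it works.

```python
import random


def systematic_sample(trace, interval, offset=0):
    """Static systematic sampling: keep every `interval`-th element,
    starting at a fixed `offset`."""
    return trace[offset::interval]


def stratified_random_sample(trace, interval, rng):
    """Stratified random sampling: split the trace into consecutive
    buckets of `interval` elements and draw one element uniformly at
    random from each bucket (the last bucket may be shorter)."""
    samples = []
    for start in range(0, len(trace), interval):
        bucket = trace[start:start + interval]
        samples.append(rng.choice(bucket))
    return samples


def simple_random_sample(trace, interval, rng):
    """Simple random sampling: draw len(trace) // interval elements
    uniformly without replacement, returned in their original order."""
    count = len(trace) // interval
    indices = sorted(rng.sample(range(len(trace)), count))
    return [trace[i] for i in indices]


def sample_mean(values):
    """First-order statistic that each sampled series is meant to estimate."""
    return sum(values) / len(values)


if __name__ == "__main__":
    rng = random.Random(0)          # seeded only to make runs repeatable
    trace = list(range(1000))       # placeholder data, not real traffic
    for name, sampled in [
        ("systematic", systematic_sample(trace, 10)),
        ("stratified", stratified_random_sample(trace, 10, rng)),
        ("simple random", simple_random_sample(trace, 10, rng)),
    ]:
        print(name, len(sampled), sample_mean(sampled))
```

All three functions take roughly one sample per `interval` elements, so their estimates of the mean can be compared at the same sampling rate.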