On sampling self-similar internet traffic

  • Authors:
  • Guanghui He;Jennifer C. Hou

  • Affiliations:
  • Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL;Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL

  • Venue:
  • Computer Networks: The International Journal of Computer and Telecommunications Networking
  • Year:
  • 2006

Abstract

Techniques for sampling Internet traffic are important for understanding the traffic characteristics of the Internet [A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, F. True, Deriving traffic demands for operational IP networks: methodology and experience, in: Proc. ACM SIGCOMM'00, August 2000, pp. 257-270; N.G. Duffield, M. Grossglauser, Trajectory sampling for direct traffic observation, in: Proc. ACM SIGCOMM'00, August 2000, pp. 271-282]. Despite the research effort devoted to packet sampling, no prior work has taken the self-similarity of Internet traffic into account when devising sampling strategies. In this paper, we perform an in-depth, analytical study of three sampling techniques for self-similar Internet traffic, namely static systematic sampling, stratified random sampling and simple random sampling. We show that while all three sampling techniques can accurately capture the Hurst parameter (second-order statistics) of Internet traffic, they fail to capture the mean (first-order statistics) faithfully. We also show that static systematic sampling yields the smallest variation in sampling results across different sampling instances (i.e., it gives sampling results of high fidelity). Based on an important observation, we then devise a new variation of static systematic sampling, called biased systematic sampling (BSS), that gives much more accurate estimates of the mean while keeping the sampling overhead low. Both the analysis of the three sampling techniques and the evaluation of BSS are performed on synthetic and real Internet traffic traces. Our performance study shows that BSS gives a performance improvement of 40% and 20% (in terms of efficiency) as compared to static systematic and simple random sampling, respectively.
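
For readers unfamiliar with the three baseline techniques named in the abstract, the following is a minimal sketch of their textbook definitions, applied to a trace represented as a list of per-interval traffic volumes. It is not taken from the paper: the function names, the `interval` and `n` parameters, and the list representation of the trace are all assumptions made for illustration. BSS is not sketched, because the abstract does not describe its mechanism.

```python
import random


def systematic_sample(trace, interval, offset=0):
    """Static systematic sampling: keep every `interval`-th element,
    starting at position `offset` (hypothetical parameter names)."""
    return trace[offset::interval]


def stratified_random_sample(trace, interval, rng):
    """Stratified random sampling: split the trace into consecutive
    blocks of `interval` elements and draw one element uniformly at
    random from each block (the last block may be shorter)."""
    samples = []
    for start in range(0, len(trace), interval):
        block = trace[start:start + interval]
        samples.append(rng.choice(block))
    return samples


def simple_random_sample(trace, n, rng):
    """Simple random sampling: draw `n` elements uniformly at random
    without replacement, returned in their original time order."""
    indices = sorted(rng.sample(range(len(trace)), n))
    return [trace[i] for i in indices]


def sample_mean(values):
    """Arithmetic mean of the sampled values."""
    return sum(values) / len(values)


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder data only: independent draws, NOT self-similar traffic.
    trace = [rng.expovariate(1.0) for _ in range(10_000)]
    print("full trace mean:", sample_mean(trace))
    print("systematic     :", sample_mean(systematic_sample(trace, 100)))
    print("stratified     :", sample_mean(stratified_random_sample(trace, 100, rng)))
    print("simple random  :", sample_mean(simple_random_sample(trace, 100, rng)))
```

The demo trace consists of independent exponential draws, so it only shows how the samplers are called; it does not reproduce the mean-estimation bias the abstract reports for self-similar traffic.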