Investigating self-similarity and heavy-tailed distributions on a large-scale experimental facility

  • Authors:
  • Patrick Loiseau;Paulo Gonçalves;Guillaume Dewaele;Pierre Borgnat;Patrice Abry;Pascale Vicat-Blanc Primet

  • Affiliations:
  • INRIA Paris-Rocquencourt, Le Chesnay Cedex, France and École Normale Supérieure de Lyon, LIP, Université de Lyon, Lyon Cedex 07, France;INRIA, École Normale Supérieure de Lyon, LIP, Université de Lyon, Lyon Cedex 07, France;Laboratoire de Physique, École Normale Supérieure de Lyon, Université de Lyon, Lyon Cedex 07, France;Laboratoire de Physique, CNRS, UMR, École Normale Supérieure de Lyon, Université de Lyon, Lyon Cedex 07, France;Laboratoire de Physique, CNRS, UMR, École Normale Supérieure de Lyon, Université de Lyon, Lyon Cedex 07, France;INRIA, École Normale Supérieure de Lyon, LIP, Université de Lyon, Lyon Cedex 07, France

  • Venue:
  • IEEE/ACM Transactions on Networking (TON)
  • Year:
  • 2010

Abstract

After the seminal work by Taqqu et al. relating self-similarity to heavy-tailed distributions, a number of research articles verified that aggregated Internet traffic time series show self-similarity and that Internet attributes, such as Web file sizes and flow lengths, are heavy-tailed. However, the validation of the theoretical prediction relating self-similarity and heavy tails remains unsatisfactorily addressed, having been investigated only through numerical or network simulations, or on uncontrolled Web traffic data. Notably, this prediction has never been conclusively verified on real networks using controlled and stationary scenarios, prescribing specific heavy-tailed distributions, and estimating confidence intervals. With this goal in mind, we use the potential and facilities offered by the large-scale, deeply reconfigurable and fully controllable experimental Grid5000 instrument, combined with state-of-the-art estimators, to investigate the prediction's observability on real networks. To this end, we organize a large number of controlled traffic circulation sessions on a nationwide real network involving 200 independent hosts. We use an FPGA-based measurement system to collect the corresponding traffic at packet level. We then estimate, independently, both the self-similarity exponent of the aggregated time series and the heavy-tail index of the flow-size distributions. Not only do our results complement and validate, with striking accuracy, some conclusions drawn from a series of pioneering studies, but they also bring new insights into the controversial role of certain components of real networks.
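The prediction discussed in the abstract is commonly stated as H = (3 − α)/2 for a tail index 1 < α < 2, where H is the self-similarity exponent of the aggregated traffic and α is the heavy-tail index of the flow-size distribution. The abstract does not give this formula or name its estimators, so the following is only a minimal sketch under stated assumptions: it uses the classical Hill estimator as a stand-in for the tail-index estimation, draws synthetic Pareto flow sizes instead of measured traffic, and all function names (`hill_tail_index`, `predicted_hurst`, `synthetic_flow_sizes`) are hypothetical.

```python
import math
import random


def hill_tail_index(samples, k):
    """Hill estimate of the tail index alpha from the k largest samples.

    Assumption: a stand-in estimator, not the one used in the paper.
    Samples must be strictly positive.
    """
    if not 0 < k < len(samples):
        raise ValueError("k must satisfy 0 < k < len(samples)")
    ordered = sorted(samples, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest value
    log_excess = sum(math.log(x / threshold) for x in ordered[:k])
    return k / log_excess


def predicted_hurst(alpha):
    """Self-similarity exponent predicted from the tail index: H = (3 - alpha) / 2."""
    if not 1.0 < alpha < 2.0:
        raise ValueError("the relation is stated for 1 < alpha < 2")
    return (3.0 - alpha) / 2.0


def synthetic_flow_sizes(alpha, n, seed=0):
    """Draw n Pareto-distributed flow sizes (synthetic data, for illustration only)."""
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n)]


if __name__ == "__main__":
    sizes = synthetic_flow_sizes(alpha=1.5, n=20000)
    alpha_hat = hill_tail_index(sizes, k=2000)
    print("estimated tail index:", alpha_hat)
    print("predicted self-similarity exponent:", predicted_hurst(alpha_hat))
```

In an experiment of the kind the abstract describes, the predicted value would be compared against an independent estimate of H obtained from the aggregated packet-level time series, ideally with confidence intervals on both quantities. The choice of `k` (how many of the largest samples count as the tail) is a tuning decision in this sketch and strongly affects the Hill estimate on real data.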