Exploiting packet-sampling measurements for traffic characterization and classification

  • Authors:
  • Davide Tammaro; Silvio Valenti; Dario Rossi; Antonio Pescapé

  • Affiliations:
  • INFRES Department, TELECOM ParisTech, 75634, Paris, France; INFRES Department, TELECOM ParisTech, 75634, Paris, France; INFRES Department, TELECOM ParisTech, 75634, Paris, France; Department of Computer Science and Systems, University of Naples Federico II, 80125, Naples, Italy

  • Venue:
  • International Journal of Network Management
  • Year:
  • 2012

Abstract

Packet sampling for traffic measurement has become mandatory for network operators, who must cope with the huge amount of data transmitted in today's networks over increasingly fast transmission technologies. As a result, many networking tasks must already work with sampled data, which is more readily available but carries less information. In this work we assess the impact of packet sampling on various network monitoring activities, with a particular focus on traffic characterization and classification. We process an extremely heterogeneous dataset composed of four packet-level traces (representative of different access technologies and operational environments) with a traffic monitor that can apply different sampling policies and rates to the traffic and extract several features, both in aggregate and per flow. This provides empirical evidence of the impact of packet sampling on both traffic measurement and traffic classification. First, we analyze feature distortion, quantified by means of two statistical metrics: most features are already degraded at a low sampling step, regardless of the sampling policy, while only a few remain consistent under harsh sampling conditions, which may even introduce artifacts that undermine the correctness of measurements. Second, we evaluate the performance of traffic classification under sampling. The information content of the features, although degraded, still allows good classification accuracy, provided that the classifier is trained with data obtained at the same sampling rate as the target data. The accuracy also stems from a careful choice of a smart sampling policy, which biases the sampling towards the packets carrying the most useful information. Copyright © 2012 John Wiley & Sons, Ltd.