Fisher information of sampled packets: an application to flow size estimation

  • Authors:
  • Bruno Ribeiro; Don Towsley; Tao Ye; Jean C. Bolot

  • Affiliations:
  • University of Massachusetts at Amherst, Amherst, MA; University of Massachusetts at Amherst, Amherst, MA; Sprint ATL, Burlingame, CA; Sprint ATL, Burlingame, CA

  • Venue:
  • Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
  • Year:
  • 2006

Abstract

Packet sampling is widely used in network monitoring. Sampled packet streams are often used to determine flow-level statistics of network traffic. To date there is conflicting evidence on the quality of the resulting estimates. In this paper we take a systematic approach, using the Fisher information metric and the Cramér-Rao bound, to understand the contributions that different types of information within sampled packets have on the quality of flow-level estimates. We provide concrete evidence that, without protocol information and with packet sampling rate p = 0.005, any accurate unbiased estimator needs approximately 10^16 sampled flows. The required number of sampled flows drops to roughly 10^4 with the use of TCP sequence numbers. Furthermore, additional SYN flag information significantly reduces the estimation error of short flows. We present a Maximum Likelihood Estimator (MLE) that relies on all of this information and show that it is efficient, even when applied to a small sample set. We validate our results using Tier-1 Internet backbone traces and evaluate the benefits of sampling from multiple monitors. Our results show that combining estimates from several monitors is 50% less accurate than an estimate based on all samples.
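
To illustrate how the Fisher information and the Cramér-Rao bound translate a sampling rate into a required number of flows, here is a minimal sketch on a toy model. It is not the paper's model or estimator. It assumes flows contain only 1 or 2 packets, that each packet is sampled independently with probability p, and that flows with zero sampled packets still count as observations. The function names `binom_pmf`, `fisher_info_two_sizes`, and `flows_needed` are hypothetical.

```python
from math import comb


def binom_pmf(i, j, p):
    """Probability that j of a flow's i packets are sampled at rate p."""
    if j > i:
        return 0.0
    return comb(i, j) * p**j * (1 - p) ** (i - j)


def fisher_info_two_sizes(theta1, p):
    """Fisher information about theta1 = P(flow has 1 packet), per observed flow.

    Toy assumptions (not from the paper): flows have 1 or 2 packets, and
    flows with zero sampled packets are still counted as an observation.
    """
    info = 0.0
    for j in range(3):  # j = number of sampled packets: 0, 1, or 2
        # Probability of observing j sampled packets from a random flow.
        d_j = theta1 * binom_pmf(1, j, p) + (1 - theta1) * binom_pmf(2, j, p)
        # Derivative of d_j with respect to theta1.
        grad = binom_pmf(1, j, p) - binom_pmf(2, j, p)
        if d_j > 0:
            info += grad * grad / d_j
    return info


def flows_needed(theta1, p, std_err):
    """Cramér-Rao lower bound on the flows needed to reach a target
    standard error with any unbiased estimator. Requires 0 < p <= 1."""
    return 1.0 / (fisher_info_two_sizes(theta1, p) * std_err**2)
```

Calling `flows_needed(0.5, 0.005, 0.01)` and `flows_needed(0.5, 0.5, 0.01)` shows the qualitative effect the abstract describes: as p shrinks, each observed flow carries less information and the bound on the number of flows grows. The bound follows from Var(estimate) ≥ 1 / (n · I), where n is the number of observed flows and I is the per-flow Fisher information.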