Probabilistic lossy counting: an efficient algorithm for finding heavy hitters

  • Authors:
  • Xenofontas Dimitropoulos;Paul Hurley;Andreas Kind

  • Affiliations:
  • IBM Zurich Research Laboratory;IBM Zurich Research Laboratory;IBM Zurich Research Laboratory

  • Venue:
  • ACM SIGCOMM Computer Communication Review
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Knowledge of the largest traffic ows in a network is important for many network management applications. The problem of finding these ows is known as the heavy-hitter problem and has been the subject of many studies in the past years. One of the most efficient and well-known algorithms for finding heavy hitters is lossy counting [29]. In this work we introduce probabilistic lossy counting (PLC), which enhances lossy counting in computing network traffic heavy hitters. PLC uses on a tighter error bound on the estimated sizes of traffic ows and provides probabilistic rather than deterministic guarantees on its accuracy. The probabilistic-based error bound substantially improves the memory consumption of the algorithm. In addition, PLC reduces the rate of false positives of lossy counting and achieves a low estimation error, although slightly higher than that of lossy counting We compare PLC with state-of-the-art algorithms for finding heavy hitters. Our experiments using real traffic traces find that PLC has 1) between 34.4% and 74% lower memory consumption, 2) between 37.9% and 40.5% fewer false positives than lossy counting, and 3) a small estimation error.