Adaptive shared-state sampling

Authors:
Frederic Raspall; Sebastia Sallent
Affiliations:
Technical University of Catalonia, Barcelona, Spain; Technical University of Catalonia, Barcelona, Spain
Venue:
Proceedings of the 8th ACM SIGCOMM conference on Internet measurement
Year:
2008

Abstract

We present two algorithms for the problem of identifying and measuring heavy-hitters. Our schemes report, with high probability, those flows that exceed a prescribed share of the traffic observed so far, along with an estimate of their sizes. One of the biggest advantages of our schemes is that they rely entirely on sampling. This makes them flexible and lightweight, and allows them to be implemented in cheap DRAM and to scale to very high speeds. Despite relying on sampling, our algorithms can provide very accurate results and offer performance guarantees that are independent of the traffic mix. Most remarkably, the schemes are shown to require a constant amount of memory, regardless of the volume and composition of the traffic observed. Thus, besides being computationally light, cost-effective and flexible, they are scalable and robust against malicious traffic patterns. We provide theoretical and empirical results on their performance, the latter obtained with software implementations and real traffic traces.