Bitmap algorithms for counting active flows on high-speed links

Authors:
Cristian Estan;George Varghese;Michael Fisk
Affiliations:
Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI;Department of Computer Science and Engineering, University of California at San Diego, La Jolla, CA;Department of Computer Science and Engineering, University of California at San Diego, La Jolla, CA
Venue:
IEEE/ACM Transactions on Networking (TON)
Year:
2006

Abstract

This paper presents a family of bitmap algorithms for counting the number of distinct header patterns (flows) seen on a high-speed link. Such counting can be used to detect DoS attacks and port scans and to solve measurement problems. Counting is especially hard when each packet must be processed within a packet arrival time (8 ns at OC-768 speeds), which allows only a small number of accesses to limited, fast memory. A naive solution that maintains a hash table requires several megabytes because the number of flows can exceed a million. By contrast, our new probabilistic algorithms use little memory and are fast. The reduction in memory is particularly important for applications that run multiple concurrent counting instances. For example, we replaced the port-scan detection component of the popular intrusion detection system Snort with one of our new algorithms. This reduced memory usage on a ten-minute trace from 50 MB to 5.6 MB while maintaining a 99.77% probability of alarming on a scan within 6 s of when the large-memory algorithm would. The best known prior algorithm (probabilistic counting) takes four times more memory on port-scan detection and eight times more on a measurement application. These savings are possible because our algorithms can be customized to take advantage of special features, such as a large number of instances that have very small counts or prior knowledge of the likely range of the count.
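
To make the memory argument concrete, the following is a minimal sketch of the simplest bitmap-style counter: each flow identifier is hashed to one bit, and the number of distinct flows is estimated from the fraction of bits still zero using the standard linear-counting estimate b·ln(b/z). The abstract does not spell out the algorithms, so the class name `DirectBitmap`, the choice of BLAKE2b as the hash, and the flow-identifier format are assumptions made for illustration; the tuned variants the abstract alludes to are not shown.

```python
import hashlib
import math


class DirectBitmap:
    """Illustrative linear-counting estimator of distinct flow IDs.

    Hypothetical helper for this sketch, not the authors' implementation.
    """

    def __init__(self, num_bits: int) -> None:
        self.num_bits = num_bits
        # One bit per hash bucket, packed eight to a byte.
        self.bits = bytearray((num_bits + 7) // 8)

    def _position(self, flow_id: bytes) -> int:
        # Assumption: any well-mixed hash works; BLAKE2b is used for convenience.
        digest = hashlib.blake2b(flow_id, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.num_bits

    def add(self, flow_id: bytes) -> None:
        # A single memory write per packet; repeated packets of the same
        # flow hit the same bit and therefore do not change the state.
        pos = self._position(flow_id)
        self.bits[pos >> 3] |= 1 << (pos & 7)

    def zero_bits(self) -> int:
        set_bits = sum(bin(byte).count("1") for byte in self.bits)
        return self.num_bits - set_bits

    def estimate(self) -> float:
        # Linear-counting estimate: n ~= b * ln(b / z), with z zero bits.
        zeros = self.zero_bits()
        if zeros == 0:
            return float("inf")  # bitmap saturated; too small for this load
        return self.num_bits * math.log(self.num_bits / zeros)
```

A bitmap of 8192 bits occupies 1 KB, whereas a hash table holding the same flows must store every identifier. The sketch trades that exactness for a small estimation error, and it saturates once the number of flows is much larger than the number of bits, which is the limitation that motivates sizing or adapting the bitmap to the expected range of the count.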