Reversible sketches for efficient and accurate change detection over network data streams

Authors:
Robert Schweller;Ashish Gupta;Elliot Parsons;Yan Chen
Affiliations:
Northwestern University, Evanston, IL;Northwestern University, Evanston, IL;Northwestern University, Evanston, IL;Northwestern University, Evanston, IL
Venue:
Proceedings of the 4th ACM SIGCOMM conference on Internet measurement
Year:
2004

References (16):
Probabilistic counting algorithms for data base applications. Journal of Computer and System Sciences.
Coding and information theory (2nd ed.).
Introduction to algorithms.
Bro: a system for detecting network intruders in real-time. Computer Networks: The International Journal of Computer and Telecommunications Networking.
New directions in traffic measurement and accounting. Proceedings of the 2002 conference on Applications, technologies, architectures, and protocols for computer communications.
Properties and prediction of flow statistics from sampled packet streams. Proceedings of the 2nd ACM SIGCOMM Workshop on Internet measurement.
Data streams: algorithms and applications. SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms.
How to Own the Internet in Your Spare Time. Proceedings of the 11th USENIX Security Symposium.
Intrusion Detection via Static Analysis. SP '01 Proceedings of the 2001 IEEE Symposium on Security and Privacy.
Sketch-based change detection: methods, evaluation, and applications. Proceedings of the 3rd ACM SIGCOMM conference on Internet measurement.
Flow sampling under hard resource constraints. Proceedings of the joint international conference on Measurement and modeling of computer systems.
Holistic UDAFs at streaming speeds. SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data.
Approximate frequency counts over data streams. VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases.
Intrusion detection using sequences of system calls. Journal of Computer Security.
Finding hierarchical heavy hitters in data streams. VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29.
Integrated access control and intrusion detection for Web servers. IEEE Transactions on Parallel and Distributed Systems.

Cited by (13):
Improving sketch reconstruction accuracy using linear least squares method. IMC '05 Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement.
High-speed detection of unsolicited bulk emails. Proceedings of the 3rd ACM/IEEE Symposium on Architecture for networking and communications systems.
Probabilistic lossy counting: an efficient algorithm for finding heavy hitters. ACM SIGCOMM Computer Communication Review.
Information Assurance: Dependability and Security in Networked Systems.
The eternal sunshine of the sketch data structure. Computer Networks: The International Journal of Computer and Telecommunications Networking.
Coordinated weighted sampling for estimating aggregates over multiple weight assignments. Proceedings of the VLDB Endowment.
HiFIND: A high-speed flow-level intrusion detection approach with DoS resiliency. Computer Networks: The International Journal of Computer and Telecommunications Networking.
Detection of traffic changes in large-scale backbone networks: The case of the Spanish academic network. Computer Networks: The International Journal of Computer and Telecommunications Networking.
Disclosing the element distribution of bloom filter. ICCS '06 Proceedings of the 6th international conference on Computational Science - Volume Part I.
Detecting anomalies in backbone network traffic: a performance comparison among several change detection methods. International Journal of Sensor Networks.
Software defined traffic measurement with OpenSketch. NSDI '13 Proceedings of the 10th USENIX conference on Networked Systems Design and Implementation.
A methodological overview on anomaly detection. Data Traffic Monitoring and Analysis.
Mining most frequently changing component in evolving graphs. World Wide Web.

Abstract

Traffic anomalies such as failures and attacks are increasing in frequency and severity, so identifying them rapidly and accurately is critical for large network operators. Detection typically treats the traffic as a collection of flows and looks for heavy changes in traffic patterns (e.g., volume, number of connections). However, as link speeds and the number of flows increase, keeping per-flow state does not scale. The recently proposed sketch-based schemes [14] are among the very few that can detect heavy changes and anomalies over massive data streams at network traffic speeds. However, sketches do not preserve the keys (e.g., source IP addresses) of the flows. Hence, even when anomalies are detected, it is difficult to infer the culprit flows, which is a major practical hurdle for online deployment, and the number of keys is too large to record explicitly. To address this challenge, we propose efficient reversible hashing algorithms that infer the keys of culprit flows from sketches without storing any explicit key information. No extra memory or memory accesses are needed for recording the streaming data. Meanwhile, the heavy change detection daemon runs in the background with space and time complexity sublinear in the size of the key space. This short paper describes the conceptual framework of reversible sketches, as well as some initial approaches for implementation; see [23] for details of the optimized algorithms. We further apply various IP-mangling algorithms and bucket classification methods to reduce false positives and false negatives. Evaluated on NetFlow traffic traces from a large edge router, the reverse hashing quickly infers the keys of culprit flows with high accuracy, even when there are many changes.
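
To make the key-recovery problem concrete, here is a minimal Python sketch of a simplified k-ary-style sketch used for change detection. It is an illustration under stated assumptions, not the paper's implementation: the class name `KarySketch`, the BLAKE2-based row hashes, the table dimensions, the threshold, and the toy traffic are all hypothetical. It shows that the difference of two sketches reveals that a heavy change happened, while naming the culprit still requires testing candidate keys one by one, which is the step reversible hashing is meant to avoid.

```python
import hashlib
from statistics import median


class KarySketch:
    """Simplified k-ary-style sketch: `rows` hash tables of `buckets` counters.

    Illustrative only; names and parameters are assumptions, not the paper's.
    """

    def __init__(self, rows=5, buckets=1024):
        self.rows, self.buckets = rows, buckets
        self.table = [[0] * buckets for _ in range(rows)]

    def _bucket(self, row, key):
        # One independent-looking hash per row, derived by salting BLAKE2b.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=bytes([row])
        ).digest()
        return int.from_bytes(digest, "big") % self.buckets

    def update(self, key, value):
        # The key only selects counters; it is never stored.
        for r in range(self.rows):
            self.table[r][self._bucket(r, key)] += value

    def estimate(self, key):
        # Per-row unbiased estimate, then the median across rows.
        ests = []
        for r in range(self.rows):
            total = sum(self.table[r])
            cell = self.table[r][self._bucket(r, key)]
            ests.append((cell - total / self.buckets) / (1 - 1 / self.buckets))
        return median(ests)

    def difference(self, other):
        # Sketches are linear, so the change sketch is a counter-wise subtraction.
        diff = KarySketch(self.rows, self.buckets)
        for r in range(self.rows):
            diff.table[r] = [a - b for a, b in zip(self.table[r], other.table[r])]
        return diff


def build(traffic):
    sketch = KarySketch()
    for key, value in traffic:
        sketch.update(key, value)
    return sketch


# Hypothetical per-interval byte counts keyed by source IP.
before = build([("10.0.0.1", 100), ("10.0.0.2", 120), ("10.0.0.3", 90)])
after = build([("10.0.0.1", 110), ("10.0.0.2", 5000), ("10.0.0.3", 95)])
change = after.difference(before)

# Without reversibility, finding the culprit means probing candidate keys.
candidates = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
threshold = 1000
culprits = [k for k in candidates if abs(change.estimate(k)) >= threshold]
print(culprits)
```

In this toy run the candidate list has three entries, so probing is trivial. For 32-bit source addresses the same loop would have to cover the whole key space, which is why inferring keys directly from the heavy buckets matters.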