Joint data streaming and sampling techniques for detection of super sources and destinations

  • Authors:
  • Qi Zhao; Abhishek Kumar; Jun Xu

  • Affiliations:
  • College of Computing, Georgia Institute of Technology; College of Computing, Georgia Institute of Technology; College of Computing, Georgia Institute of Technology

  • Venue:
  • IMC '05 Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement
  • Year:
  • 2005

Abstract

Detecting the sources or destinations that have communicated with a large number of distinct destinations or sources during a small time interval is an important problem in network measurement and security. Previous detection approaches cannot deliver the desired accuracy at high link speeds (10 to 40 Gbps). In this work, we propose two novel algorithms that provide accurate and efficient solutions to this problem. Their designs are based on the insight that sampling and data streaming are often suited to capturing different and complementary regions of the information spectrum, and that a close collaboration between them is an excellent way to recover the complete information. Our first solution builds on the standard hash-based flow sampling algorithm. Its main innovation is that the sampled traffic is further filtered by a data streaming module, which allows for a much higher sampling rate and hence much higher accuracy. Our second solution is more sophisticated but offers higher accuracy. It combines the power of data streaming in efficiently estimating quantities associated with a given identity with the power of sampling in collecting a list of candidate identities. The performance of both solutions is evaluated using both mathematical analysis and trace-driven experiments on real-world Internet traffic.
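To make the idea behind the first solution more concrete, the following is a minimal Python sketch of hash-based flow sampling followed by a filtering stage. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the filter, so a simple bit array that suppresses repeated packets of an already-recorded flow is assumed here, and the names `FanOutEstimator`, `observe`, `estimate`, `sampling_rate`, and `filter_bits` are hypothetical.

```python
import hashlib
from collections import Counter


def _hash(key: str, salt: str) -> int:
    """Return a deterministic 64-bit hash of key, made independent per salt."""
    digest = hashlib.sha256((salt + "|" + key).encode()).digest()
    return int.from_bytes(digest[:8], "big")


class FanOutEstimator:
    """Estimate how many distinct destinations each source contacts.

    Assumption (not taken from the abstract): the streaming filter is a
    plain bit array indexed by a hash of the flow label.
    """

    def __init__(self, sampling_rate: float = 0.25, filter_bits: int = 1 << 16):
        self.sampling_rate = sampling_rate
        # Integer threshold so the sampling test avoids float rounding.
        self.threshold = int(sampling_rate * 2**64)
        self.filter_bits = filter_bits
        # One byte per filter bit, for readability rather than compactness.
        self.bits = bytearray(filter_bits)
        self.sampled = Counter()

    def observe(self, src: str, dst: str) -> None:
        flow = f"{src}->{dst}"
        # Hash-based flow sampling: every packet of a flow hashes to the
        # same value, so a flow is either fully sampled or fully skipped.
        if _hash(flow, "sample") >= self.threshold:
            return
        # Filtering stage: a set bit means this flow (or a colliding one)
        # was already recorded, so the packet is dropped.
        slot = _hash(flow, "filter") % self.filter_bits
        if self.bits[slot]:
            return
        self.bits[slot] = 1
        self.sampled[src] += 1

    def estimate(self, src: str) -> float:
        # Scale the sampled flow count by the inverse sampling rate.
        return self.sampled[src] / self.sampling_rate


if __name__ == "__main__":
    est = FanOutEstimator(sampling_rate=0.5)
    for i in range(1000):
        est.observe("10.0.0.1", f"192.168.{i // 256}.{i % 256}")
    print(est.estimate("10.0.0.1"))
```

In this sketch the filter keeps repeated packets of one flow from reaching the per-source counters, so the downstream work grows with the number of sampled flows rather than sampled packets. Hash collisions in the bit array can cause a small undercount, which this sketch does not correct for.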