Sequential hashing: A flexible approach for unveiling significant patterns in high speed networks

  • Authors:
  • Tian Bu; Jin Cao; Aiyou Chen; Patrick P. C. Lee

  • Affiliations:
  • Bell Labs, Alcatel-Lucent, 600-700 Mountain Avenue, Murray Hill NJ 07974, USA; Bell Labs, Alcatel-Lucent, 600-700 Mountain Avenue, Murray Hill NJ 07974, USA; Bell Labs, Alcatel-Lucent, 600-700 Mountain Avenue, Murray Hill NJ 07974, USA; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong

  • Venue:
  • Computer Networks: The International Journal of Computer and Telecommunications Networking
  • Year:
  • 2010

Abstract

Identifying significant patterns in network traffic, such as IPs or flows that contribute a large volume (heavy hitters) or that introduce large changes in volume (heavy changers), has many applications in accounting and network anomaly detection. As network speeds and the number of flows grow rapidly, identifying heavy hitters/changers by tracking per-IP or per-flow statistics becomes infeasible because of both the computational overhead and the memory requirements. In this paper, we propose SeqHash, a novel sequential hashing scheme that supports fast and accurate recovery of heavy hitters/changers while requiring memory only slightly above the theoretical lower bound. SeqHash monitors data traffic using a sketch data structure that can flexibly trade off memory usage against computational overhead over a wide range, which different computer architectures can exploit to optimize overall performance. In addition, we propose statistically efficient algorithms for estimating the values of heavy hitters/changers. Using both mathematical analysis and experimental studies of Internet traces, we demonstrate that SeqHash achieves the same accuracy as existing methods while using much less memory and computational overhead.
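
To make the two definitions in the abstract concrete, the following Python sketch computes heavy hitters and heavy changers by exact per-key counting. This is not SeqHash: it is the naive per-IP tracking baseline that the abstract describes as infeasible at high speed, shown only to illustrate what a sketch-based scheme must recover. The function names, the absolute-threshold convention, and the sample addresses and byte counts are all assumptions made for illustration.

```python
from collections import Counter


def aggregate(packets):
    """Sum bytes per key (here, a source IP) over one measurement interval.

    The per-key table built here is the state that grows with the number
    of flows, which is what makes this baseline expensive in practice.
    """
    totals = Counter()
    for src_ip, nbytes in packets:
        totals[src_ip] += nbytes
    return totals


def heavy_hitters(volumes, threshold):
    """Keys whose volume in a single interval is at least `threshold`."""
    return {key for key, vol in volumes.items() if vol >= threshold}


def heavy_changers(prev, curr, threshold):
    """Keys whose absolute volume change across two intervals is at least `threshold`."""
    keys = set(prev) | set(curr)
    return {k for k in keys if abs(curr.get(k, 0) - prev.get(k, 0)) >= threshold}


# Hypothetical traffic for two consecutive intervals: (source IP, bytes).
interval_1 = aggregate([("10.0.0.1", 900), ("10.0.0.2", 50), ("10.0.0.3", 40)])
interval_2 = aggregate([("10.0.0.1", 880), ("10.0.0.2", 700), ("10.0.0.3", 45)])

hitters = heavy_hitters(interval_2, threshold=500)
changers = heavy_changers(interval_1, interval_2, threshold=500)
```

In this sample, `10.0.0.1` is a heavy hitter in the second interval but not a heavy changer, because its volume is large yet stable, while `10.0.0.2` is both. The two notions are therefore distinct.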