An improved algorithm to accelerate regular expression evaluation

  • Authors:
  • Michela Becchi;Patrick Crowley

  • Affiliations:
  • Washington University, St. Louis, MO;Washington University, St. Louis, MO

  • Venue:
  • Proceedings of the 3rd ACM/IEEE Symposium on Architecture for networking and communications systems
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Modern network intrusion detection systems need to perform regular expression matching at line rate in order to detect the occurrence of critical patterns in packet payloads. While deterministic finite automata (DFAs) allow this operation to be performed in linear time, they may exhibit prohibitive memory requirements. In [9], Kumar et al. propose Delayed Input DFAs (D2FAs), which provide a trade-off between the memory requirements of the compressed DFA and the number of states visited for each character processed, which corresponds directly to the memory bandwidth required to evaluate regular expressions. In this paper we introduce a general compression technique that results in at most 2N state traversals when processing a string of length N. In comparison to the D2FA approach, our technique achieves comparable levels of compression, with lower provable bounds on memory bandwidth (or greater compression for a given bandwidth bound). Moreover, our proposed algorithm has lower complexity, is suitable for scenarios where a compressed DFA needs to be dynamically built or updated, and fosters locality in the traversal process. Finally, we also describe a novel alphabet reduction scheme for DFA-based structures that can yield further dramatic reductions in data structure size.