Automation Framework for Large-Scale Regular Expression Matching on FPGA

  • Authors:
  • Thilan Ganegedara;Yi-Hua E. Yang;Viktor K. Prasanna

  • Affiliations:
  • -;-;-

  • Venue:
  • FPL '10 Proceedings of the 2010 International Conference on Field Programmable Logic and Applications
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present an extensible automation framework for constructing and optimizing large-scale regular expression matching (REM) circuits on FPGA. Paralleling the technique used by software compilers, we divide our framework into two parts: a frontend that parses each PCRE-formatted regular expression (regex) into a modular non-deterministic finite automaton (RE-NFA), followed by a backend that generates the REM circuit design for a multi-pipeline architecture. With such organization, various pattern and circuit level optimizations can be applied to the frontend and backend, respectively. The multi-pipeline architecture utilizes both logic slices and on-chip BRAM for optimized character matching; in addition, it can be configured at compile-time to produce concurrent matching outputs from multiple RE-NFAs. Our framework prototype handles up to 64k "regular" regexes with arbitrary complexity and number of states, limited only by the hardware resources of the target device. Running on a commodity 2.3 GHz PC (AMD Opteron 1356), it takes less than a minute for the framework to convert ~1800 regexes used by the Snort IDS into RTL-level designs with optimized logic and memory usage. Such an automation framework could be invaluable to REM systems to update regex definitions with minimal human intervention.