Extending the PCRE Library with Static Backtracking Based Just-in-Time Compilation Support

  • Authors:
  • Zoltán Herczeg

  • Affiliations:
  • University of Szeged, Department of Software Engineering, 13 Dugonics Square, Szeged, Hungary

  • Venue:
  • Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
  • Year:
  • 2014

Abstract

High matching performance of regular expressions is a critical requirement for many widely used software tools today, including web servers, firewalls, and intrusion detection systems. Backtracking regular expression engines have been considerably improved in the last decade as a result of this requirement. Today, state-of-the-art engines use just-in-time (JIT) compilation to generate machine code from regular expressions, and they use new, innovative techniques to further improve the speed of the generated code. In the present paper, we introduce a new technique called static backtracking, which allows simultaneous optimization of both matching and backtracking. Based on this technique, we developed a JIT compiler for the widely used PCRE regular expression library. Our compiler supports all valid PCRE patterns, which shows that static backtracking is a viable choice for Perl-compatible engines. We also show that our balanced, abstract syntax tree based code generator efficiently improves the performance of long-running, backtracking-heavy regular expressions. Compared to another JIT-accelerated regular expression engine, PCRE-JIT was able to run these patterns 1.95 times faster. Since these long-running patterns dominate the total runtime, PCRE-JIT achieved 1.63 times faster matching speed overall. We also observed a 6.36 times average speedup compared to the PCRE interpreter on 5 different CPU architectures.