Regular-expression derivatives re-examined

  • Authors:
  • Scott Owens; John Reppy; Aaron Turon

  • Affiliations:
  • University of Cambridge (e-mail: scott.owens@cl.cam.ac.uk); University of Chicago (e-mail: jhr@cs.uchicago.edu); University of Chicago, Northeastern University (e-mail: turon@ccs.neu.edu)

  • Venue:
  • Journal of Functional Programming
  • Year:
  • 2009

Abstract

Regular-expression derivatives are an old, but elegant, technique for compiling regular expressions to deterministic finite-state machines. It easily supports extending the regular-expression operators with boolean operations, such as intersection and complement. Unfortunately, this technique has been lost in the sands of time and few computer scientists are aware of it. In this paper, we reexamine regular-expression derivatives and report on our experiences in the context of two different functional-language implementations. The basic implementation is simple and we show how to extend it to handle large character sets (e.g., Unicode). We also show that the derivatives approach leads to smaller state machines than the traditional algorithm given by McNaughton and Yamada.
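
For readers unfamiliar with the technique, the following is a minimal sketch of derivative-based matching in Python. It is not the authors' implementation (the abstract mentions two functional-language implementations, which are not reproduced here), and the tuple encoding and the names `nullable`, `deriv`, `matches`, `both`, and `neg` are assumptions made for illustration. The sketch matches a string by repeatedly taking the derivative with respect to each input character, and shows how intersection and complement fit in with one extra case each. It does not build a DFA, simplify intermediate expressions, or handle character classes.

```python
# Regular expressions encoded as tagged tuples (an illustrative encoding).
EMPTY = ("empty",)  # matches no string at all
EPS = ("eps",)      # matches only the empty string

def char(c): return ("char", c)
def seq(r, s): return ("seq", r, s)
def alt(r, s): return ("alt", r, s)
def star(r): return ("star", r)
def both(r, s): return ("and", r, s)   # intersection
def neg(r): return ("not", r)          # complement

def nullable(r):
    """Return True if r matches the empty string."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "char"):
        return False
    if tag in ("seq", "and"):
        return nullable(r[1]) and nullable(r[2])
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    if tag == "not":
        return not nullable(r[1])
    raise ValueError(f"unknown node: {tag}")

def deriv(r, c):
    """Derivative of r with respect to c: matches w exactly when r matches c + w."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "char":
        return EPS if r[1] == c else EMPTY
    if tag == "seq":
        left = seq(deriv(r[1], c), r[2])
        # If the first part can be empty, c may also be consumed by the second part.
        return alt(left, deriv(r[2], c)) if nullable(r[1]) else left
    if tag == "alt":
        return alt(deriv(r[1], c), deriv(r[2], c))
    if tag == "star":
        return seq(deriv(r[1], c), r)
    if tag == "and":
        return both(deriv(r[1], c), deriv(r[2], c))
    if tag == "not":
        return neg(deriv(r[1], c))
    raise ValueError(f"unknown node: {tag}")

def matches(r, s):
    """Match s against r by taking one derivative per input character."""
    for c in s:
        r = deriv(r, c)
    return nullable(r)

# (ab)* as a plain regular expression.
ab_star = star(seq(char("a"), char("b")))
# Strings over {a, b} that are NOT exactly "ab": intersection with a complement.
not_ab = both(star(alt(char("a"), char("b"))), neg(seq(char("a"), char("b"))))
```

In this sketch, `matches(ab_star, "abab")` holds while `matches(ab_star, "aba")` does not, and `not_ab` accepts `"ba"` but rejects `"ab"`. Compiling to a deterministic machine would additionally require treating derivatives as states and identifying equivalent expressions, which this sketch omits.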