Two-Pass Greedy Regular Expression Parsing

  • Authors:
  • Niels Bjørn Bugge Grathwohl, Fritz Henglein, Lasse Nielsen, Ulrik Terp Rasmussen

  • Affiliations:
  • Department of Computer Science, University of Copenhagen (DIKU), Denmark (all authors)

  • Venue:
  • CIAA'13: Proceedings of the 18th International Conference on Implementation and Application of Automata
  • Year:
  • 2013

Abstract

We present new algorithms for producing greedy parses for regular expressions (REs) in a semi-streaming fashion. Our lean-log algorithm executes in time O(mn) for REs of size m and input strings of size n and outputs a compact bit-coded parse tree representation. It improves on previous algorithms by: operating in only 2 passes; using only O(m) words of random-access memory (independent of n); requiring only kn bits of sequentially written and read log storage, where $k