Simple, functional, sound and complete parsing for all context-free grammars

  • Authors:
  • Tom Ridge

  • Affiliations:
  • University of Leicester, UK

  • Venue:
  • CPP'11 Proceedings of the First international conference on Certified Programs and Proofs
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Parsers for context-free grammars can be implemented directly and naturally in a functional style known as “combinator parsing”, using recursion following the structure of the grammar rules. However, naive implementations fail to terminate on left-recursive grammars, and despite extensive research the only complete parsers for general context-free grammars are constructed using other techniques such as Earley parsing. Our main contribution is to show how to construct simple, sound and complete parser implementations directly from grammar specifications, for all context-free grammars, based on combinator parsing. We then construct a generic parser generator and show that generated parsers are sound and complete. The formal proofs are mechanized using the HOL4 theorem prover. Memoized parsers based on our approach are polynomial-time in the size of the input. Preliminary real-world performance testing on highly ambiguous grammars indicates our parsers are faster than those generated by the popular Happy parser generator.