Better extensibility through modular syntax

  • Authors:
  • Robert Grimm

  • Affiliations:
  • New York University

  • Venue:
  • Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

We explore how to make the benefits of modularity available for syntactic specifications and present Rats!, a parser generator for Java that supports easily extensible syntax. Our parser generator builds on recent research on parsing expression grammars (PEGs), which, by being closed under composition, prioritizing choices, supporting unlimited lookahead, and integrating lexing and parsing, offer an attractive alternative to context-free grammars. PEGs are implemented by so-called packrat parsers, which are recursive descent parsers that memoize all intermediate results (hence their name). Memoization ensures linear-time performance in the presence of unlimited lookahead, but also results in an essentially lazy, functional parsing technique. In this paper, we explore how to leverage PEGs and packrat parsers as the foundation for extensible syntax. In particular, we show how make packrat parsing more widely applicable by implementing this lazy, functional technique in a strict, imperative language, while also generating better performing parsers through aggressive optimizations. Next, we develop a module system for organizing, modifying, and composing large-scale syntactic specifications. Finally, we describe a new technique for managing (global) parsing state in functional parsers. Our experimental evaluation demonstrates that the resulting parser generator succeeds at providing extensible syntax. In particular, Rats! enables other grammar writers to realize real-world language extensions in little time and code, and it generates parsers that consistently out-perform parsers created by two GLR parser generators.