LR(1) parser generation system: LR(1) error recovery, oracles, and generic tokens

  • Authors: Arthur Sorkin; Peter Donovan

  • Affiliations: Web Oasis, Inc., Mesa, AZ; Adobe Systems Inc., San Jose, CA

  • Venue: ACM SIGSOFT Software Engineering Notes

  • Year: 2011

Abstract

The LR(1) Parser Generation System generates full LR(1) parsers that are comparable in speed and size to those generated by LALR(1) parser generators such as yacc [5]. In addition to the inherent advantages of full LR(1) parsing, the system offers a number of novel features. This paper discusses three of them in detail: an automatic error recovery algorithm specified in the LR(1) grammar, oracles, and generic tokens. The error recovery algorithm relies on the fact that full LR(1) parse tables preserve context. Oracles are pieces of code that are defined in a grammar and executed between the scanner and the parser; they resolve token ambiguities, including semantic ones. A generic token replaces a set of syntactically identical tokens with a single token that acts, in effect, as a variable ranging over that set.
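
To make the last two ideas concrete, the following is a minimal Python sketch of oracles and generic tokens as the abstract describes them. It is not the system's actual notation or API: the names `Token`, `GENERIC_TOKENS`, `scan`, `generic_token_oracle`, `make_type_name_oracle`, `apply_oracles`, and `classify`, along with the toy token set, are all assumptions made for illustration. The sketch only shows code running between a scanner and a parser to reclassify tokens, once syntactically (collapsing operators into a generic token) and once semantically (using a set of known type names).

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Set


@dataclass(frozen=True)
class Token:
    kind: str  # grammar symbol the parser would see
    text: str  # original lexeme from the input


# Assumed generic tokens: each name stands for a set of lexemes that a
# grammar could treat as syntactically identical.
GENERIC_TOKENS = {"ADD_OP": {"+", "-"}, "MUL_OP": {"*", "/"}}

TOKEN_RE = re.compile(
    r"\s*(?:(?P<IDENT>[A-Za-z_]\w*)|(?P<NUMBER>\d+)|(?P<PUNCT>[-+*/;=]))"
)

Oracle = Callable[[Token], Token]


def scan(source: str) -> Iterator[Token]:
    """Toy scanner: yields IDENT, NUMBER, and PUNCT tokens."""
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected input near offset {pos}")
        kind = match.lastgroup
        yield Token(kind, match.group(kind))
        pos = match.end()


def generic_token_oracle(tok: Token) -> Token:
    """Replace a punctuation lexeme by the generic token that contains it."""
    if tok.kind != "PUNCT":
        return tok
    for name, lexemes in GENERIC_TOKENS.items():
        if tok.text in lexemes:
            return Token(name, tok.text)
    # Punctuation outside every generic set keeps its own lexeme as its kind.
    return Token(tok.text, tok.text)


def make_type_name_oracle(type_names: Set[str]) -> Oracle:
    """Build an oracle that resolves a semantic ambiguity: an identifier
    that names a known type is reported as TYPE_NAME instead of IDENT."""
    def oracle(tok: Token) -> Token:
        if tok.kind == "IDENT" and tok.text in type_names:
            return Token("TYPE_NAME", tok.text)
        return tok
    return oracle


def apply_oracles(tokens: Iterable[Token], oracles: List[Oracle]) -> Iterator[Token]:
    """Run every oracle, in order, on each token before the parser sees it."""
    for tok in tokens:
        for oracle in oracles:
            tok = oracle(tok)
        yield tok


def classify(source: str, type_names: Set[str]) -> List[str]:
    """Return the token kinds a parser would receive for `source`."""
    oracles = [generic_token_oracle, make_type_name_oracle(type_names)]
    return [tok.kind for tok in apply_oracles(scan(source), oracles)]
```

Under these assumptions, `classify("T x = a + b ;", {"T"})` reports `T` as `TYPE_NAME` and `+` as `ADD_OP`, while each token keeps its original lexeme in `text` so that later semantic actions can still tell `+` from `-`.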