The design of a parser generator

Authors:
David A. Workman;John B. Higdon
Affiliations:
Florida Technological University;Florida Technological University
Venue:
ACM-SE 16 Proceedings of the 16th annual Southeast regional conference
Year:
1978

Abstract

In this paper we present an algorithm for generating bounded-context parsers using a modified version of Knuth's algorithm for generating LR(k) parsers. We also describe the internal design of a parser-generating system capable of constructing the parsing tables for (s,l) bounded-context and LR(1) parsers. The parser-generating system is written in PL/1 and is operational on an IBM-370/165. The performance data we have collected to date are encouraging: building the states of an LR(1) parser for a 488-production grammar for the proposed ANS FORTRAN (SIGPLAN 1976) required 480K bytes of core and 3.5 minutes of CPU time.

Our algorithm for generating bounded-context parsers produces a canonical set of parsing tables similar to the "action" and "goto" tables required for canonical LR(k) parsers. A parameter, s, input to the algorithm specifies how many symbols may be used to make a decision in any given state of the parser. If storage is at more of a premium than time, the goto table can be organized so that its size is a function only of the number of states. The algorithm has several advantages over traditional algorithms for bounded-context and precedence parsing. First, the theory is parallel to that for LR(k) parsing and thus makes possible a unified approach to the study of bottom-up parsing. Second, the differences between precedence and bounded-context algorithms are emphasized by the theory and borne out in the efficiency of their respective implementations. Finally, the algorithm is "restartable": if a deterministic parser is not obtained for a given value of s, much of the information generated for case s can be preserved and need not be regenerated for case s+1.
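
For readers unfamiliar with the "action" and "goto" tables mentioned in the abstract, the following is a minimal sketch of how such tables drive a shift-reduce parser. It is not the authors' bounded-context algorithm or their PL/1 system. The toy grammar (S -> ( S ) | x), the hand-built tables, and the names `ACTION`, `GOTO`, `PRODUCTIONS`, and `parse` are all assumptions made for illustration, and the tables were constructed by hand rather than by a generator.

```python
# Illustrative only: a table-driven shift-reduce parser for the toy grammar
#   1: S -> ( S )
#   2: S -> x
# The tables below were built by hand for this grammar (an assumption of this
# sketch); a parser generator would normally compute them.

# Production number -> (left-hand side, length of right-hand side)
PRODUCTIONS = {1: ("S", 3), 2: ("S", 1)}

# ACTION[(state, terminal)] -> ("shift", next_state) | ("reduce", prod) | ("accept",)
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept",),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}

# GOTO[(state, nonterminal)] -> next_state
GOTO = {(0, "S"): 1, (2, "S"): 4}


def parse(text):
    """Return True if text is a sentence of the toy grammar, else False."""
    tokens = list(text) + ["$"]   # "$" marks end of input
    stack = [0]                   # stack of parser states
    pos = 0
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:         # no table entry: syntax error
            return False
        if entry[0] == "shift":
            stack.append(entry[1])
            pos += 1
        elif entry[0] == "reduce":
            lhs, rhs_len = PRODUCTIONS[entry[1]]
            del stack[-rhs_len:]  # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], lhs)])
        else:                     # accept
            return True


if __name__ == "__main__":
    for sample in ["x", "((x))", "(x", "()"]:
        print(repr(sample), parse(sample))
```

In this sketch each decision consults only the current state and one input symbol. The parameter s described in the abstract governs how many symbols a bounded-context parser may use per decision, which this toy example does not model.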