Multibox Parsers: No More Handwritten Lexical Analyzers

Authors:
Lev J. Dyadkin
Venue:
IEEE Software
Year:
1995

Abstract

Tools are available to generate the parser part of the compiler front end from the grammar describing the language being parsed. Tools like Lex/Yacc assume that the parser has two parts, or "boxes": the lexical analyzer and the syntax analyzer. This approach poses significant problems for lexically complex languages like Fortran because one box for the entire lexical analysis is not enough to express grammatically the level of complexity. As a result, compiler writers abandon Lex (which does the lexical analysis) and produce handwritten lexical analyzers, thus defeating the main purpose of the parser generator, which is to automate the production of the entire parser.

An alternative to the two-box parser, and one that overcomes these complexity problems, is the multibox parser. Instead of having a box for lexical analysis and a box for syntax analysis, the multibox parser has a string of boxes. Each box modifies its input language to produce a more "straightened" output language for the next box. The number of boxes needed depends on the complexity of the language to be parsed. This multibox approach allows the automatic generation of a lexical analyzer regardless of the language to be parsed because it has enough boxes to handle the level of lexical complexity, even in languages as complex as the new Fortran 90 standard.

Although the approach has been used in constructing compilers only for Fortran 90, it is suitable for the construction of compilers for other languages as well. In this case, the number and design of the boxes and corresponding grammars must provide for the new language.
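
The idea of a string of boxes, each handing a more "straightened" language to the next, can be illustrated with a small sketch. The following Python example is not taken from the paper: the number of boxes, their names (`drop_comments`, `join_continuations`, `squeeze_blanks`, `mark_statements`) and the hand-coded rules are all assumptions chosen for illustration, whereas the paper describes boxes generated from grammars. It uses the well-known fixed-form Fortran case in which `DO 10 I = 1, 5` starts a loop while `DO 10 I = 1.5` is an assignment to the variable `DO10I`.

```python
import re
from functools import reduce
from typing import Callable, List

# A "box" maps a list of lines in one language to a list in a simpler one.
Box = Callable[[List[str]], List[str]]

def drop_comments(lines: List[str]) -> List[str]:
    """Hypothetical box 1: drop blank lines and fixed-form comment lines
    (C, c, or * in column 1)."""
    return [ln for ln in lines if ln.strip() and ln[0] not in "Cc*"]

def join_continuations(lines: List[str]) -> List[str]:
    """Hypothetical box 2: append continuation lines (a character other
    than blank or 0 in column 6) to the statement they continue."""
    out: List[str] = []
    for ln in lines:
        if len(ln) > 5 and ln[5] not in " 0" and out:
            out[-1] += ln[6:]
        else:
            out.append(ln)
    return out

def squeeze_blanks(lines: List[str]) -> List[str]:
    """Hypothetical box 3: keep only the statement field (column 7 on),
    delete blanks, and uppercase. Simplification: statement labels are
    dropped and character constants are not protected."""
    return [ln[6:].replace(" ", "").upper() for ln in lines]

# Simplification: a comma inside parentheses would fool this pattern.
DO_LOOP = re.compile(r"^DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)$")

def mark_statements(lines: List[str]) -> List[str]:
    """Hypothetical box 4: make the statement kind explicit so that a
    later box no longer has to guess whether DO is a keyword."""
    out: List[str] = []
    for ln in lines:
        m = DO_LOOP.match(ln)
        if m:
            label, var, start, rest = m.groups()
            out.append(f"<DO> {label} {var} = {start} , {rest}")
        else:
            out.append(f"<ASSIGN> {ln}")
    return out

def run_boxes(boxes: List[Box], lines: List[str]) -> List[str]:
    """Feed the output of each box into the next one."""
    return reduce(lambda data, box: box(data), boxes, lines)

BOXES: List[Box] = [drop_comments, join_continuations,
                    squeeze_blanks, mark_statements]

source = [
    "C     classic fixed-form example",
    "      DO 10 I = 1,",
    "     &  5",
    "      DO 10 I = 1.5",
]

result = run_boxes(BOXES, source)
for line in result:
    print(line)
```

In this sketch each box stays simple because it can rely on what the earlier boxes have already removed, and handling a lexically harder language would mean adding or redesigning boxes rather than rewriting one large analyzer by hand.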