Component-based LR parsing

Authors:
Xiaoqing Wu; Barrett R. Bryant; Jeff Gray; Marjan Mernik
Affiliations:
Bank of America Corporation, CH20, 4500 Park Granada, Calabasas, CA 91302, USA; The University of Alabama at Birmingham, Department of Computer and Information Sciences, 115A Campbell Hall, 1300 University Boulevard, Birmingham, Alabama 35294-1170, USA; The University of Alabama at Birmingham, Department of Computer and Information Sciences, 115A Campbell Hall, 1300 University Boulevard, Birmingham, Alabama 35294-1170, USA; University of Maribor, Faculty of Electrical Engineering and Computer Science, Smetanova ulica 17, 2000 Maribor, Slovenia
Venue:
Computer Languages, Systems and Structures
Year:
2010

Abstract

A language implementation with proper compositionality lets a compiler developer divide and conquer the complexity of building a large language by constructing a set of smaller languages. Ideally, these small language implementations are independent of each other, so that they can be designed, implemented and debugged individually and later reused in different applications (e.g., building domain-specific languages). However, the language composition offered by several existing parser generators resides at the grammar level: all grammar modules must be composed together, and all resulting ambiguities resolved, before a single parser for the language is generated. This produces tight coupling between grammar modules, which harms information hiding and hinders independent development of language features. To address this problem, we have developed a novel parsing algorithm that we call Component-based LR (CLR) parsing, which provides code-level compositionality for language development by producing a separate parser for each grammar component. In addition to shift and reduce actions, the algorithm extends general LR parsing with switch and return actions, which allow parsing to jump from one parser to another. Our experimental evaluation demonstrates that CLR increases the comprehensibility, reusability, changeability and independent development ability of the language implementation. Moreover, the loose coupling among parser components enables CLR to describe grammars that contain LR parsing conflicts or require ambiguous token definitions, such as island grammars and embedded languages.
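
The idea of switch and return actions can be illustrated with a small table-driven sketch. It is not the authors' algorithm or table construction, which the abstract does not specify; it only shows two separately defined LR tables cooperating through a stack of parser frames. The toy grammars (host: S -> '[' Q ']', guest: Q -> Q 'n' | 'n'), the hand-written tables, the wildcard lookahead `None`, and the names `parse`, `HOST`, `GUEST` and `COMPONENTS` are all assumptions made for illustration.

```python
# Toy driver: each component owns its own LR action/goto tables.
# Hypothetical host grammar:  S -> '[' Q ']'   (Q is supplied by the guest)
# Hypothetical guest grammar: Q -> Q 'n' | 'n'
# A lookahead of None is a wildcard used when no exact entry exists.
HOST = {
    "action": {
        (0, "["): ("shift", 1),
        (1, None): ("switch", "guest"),   # hand control to the guest parser
        (2, "]"): ("shift", 3),
        (3, "$"): ("reduce", "S", 3),     # S -> '[' Q ']'
        (4, "$"): ("accept",),
    },
    "goto": {(0, "S"): 4, (1, "Q"): 2},
}
GUEST = {
    "action": {
        (0, "n"): ("shift", 1),
        (1, None): ("reduce", "Q", 1),    # Q -> 'n'
        (2, "n"): ("shift", 3),
        (2, None): ("return", "Q"),       # give control back, reporting a Q
        (3, None): ("reduce", "Q", 2),    # Q -> Q 'n'
    },
    "goto": {(0, "Q"): 2},
}
COMPONENTS = {"host": HOST, "guest": GUEST}


def parse(tokens, components, start):
    """Run the component parsers and return the list of actions taken."""
    tokens = list(tokens) + ["$"]
    pos = 0
    frames = [(start, [0])]  # (component name, LR state stack); top is active
    trace = []
    while True:
        name, stack = frames[-1]
        table = components[name]
        tok = tokens[pos]
        action = (table["action"].get((stack[-1], tok))
                  or table["action"].get((stack[-1], None)))
        if action is None:
            raise SyntaxError(f"{name}: unexpected {tok!r} in state {stack[-1]}")
        kind = action[0]
        trace.append(f"{name}:{kind}")
        if kind == "shift":
            stack.append(action[1])
            pos += 1
        elif kind == "reduce":
            _, lhs, length = action
            del stack[len(stack) - length:]            # pop the handle
            stack.append(table["goto"][(stack[-1], lhs)])
        elif kind == "switch":
            frames.append((action[1], [0]))            # start the other parser
        elif kind == "return":
            frames.pop()                               # discard the guest frame
            caller, caller_stack = frames[-1]
            goto = components[caller]["goto"]
            caller_stack.append(goto[(caller_stack[-1], action[1])])
        else:  # accept
            return trace


print(parse(["[", "n", "n", "]"], COMPONENTS, "host"))
```

In this sketch the host table never mentions the guest's tokens or states; it only names the component to switch to and the nonterminal it expects back, which is one way to read the loose coupling between parser components that the abstract describes.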