LL(*): the foundation of the ANTLR parser generator

Authors:
Terence Parr; Kathleen Fisher
Affiliations:
University of San Francisco, San Francisco, CA, USA; Tufts University, Boston, MA, USA
Venue:
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Year:
2011

Abstract

Despite the power of Parsing Expression Grammars (PEGs) and GLR, parsing is not a solved problem. Adding nondeterminism (parser speculation) to traditional LL and LR parsers can lead to unexpected parse-time behavior and introduces practical issues with error handling, single-step debugging, and side-effecting embedded grammar actions. This paper introduces the LL(*) parsing strategy and an associated grammar analysis algorithm that constructs LL(*) parsing decisions from ANTLR grammars. At parse-time, decisions gracefully throttle up from conventional fixed k=1 lookahead to arbitrary lookahead and, finally, fail over to backtracking depending on the complexity of the parsing decision and the input symbols. LL(*) parsing strength reaches into the context-sensitive languages, in some cases beyond what GLR and PEGs can express. By statically removing as much speculation as possible, LL(*) provides the expressivity of PEGs while retaining LL's good error handling and unrestricted grammar actions. Widespread use of ANTLR (over 70,000 downloads/year) shows that it is effective for a wide variety of applications.
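
The "throttle up" behavior described in the abstract can be illustrated with a small hand-written sketch. This is not ANTLR's generated code or its grammar analysis algorithm; the toy rules `s` and `t`, the token names, and the helper functions `la`, `nested`, `predict_s`, and `predict_t` are all assumptions made for illustration. The sketch only shows the idea that a decision uses one token when that suffices, scans arbitrarily far ahead when the distinguishing token sits behind a repeated prefix, and falls back to speculation when the alternatives share a nested (non-regular) prefix.

```python
EOF = "<EOF>"

def la(tokens, i):
    """Return the token at offset i, or EOF past the end of input."""
    return tokens[i] if i < len(tokens) else EOF

def predict_s(tokens):
    """Choose an alternative of the hypothetical rule
         s : ID | ID '=' expr | 'unsigned'* 'int' ID | 'unsigned'* ID ID
    Returns (alternative_number, tokens_examined)."""
    first = la(tokens, 0)
    if first == "int":                 # fixed k=1: one token decides
        return 3, 1
    if first == "ID":                  # one token is not enough; look at a second
        second = la(tokens, 1)
        if second == "=":
            return 2, 2
        if second == "ID":
            return 4, 2
        if second == EOF:
            return 1, 2
        raise SyntaxError("no viable alternative after ID")
    # Arbitrary lookahead: skip any number of 'unsigned' tokens, the way a
    # cyclic lookahead automaton would, then decide on the token that follows.
    i = 0
    while la(tokens, i) == "unsigned":
        i += 1
    if i > 0 and la(tokens, i) == "int":
        return 3, i + 1
    if i > 0 and la(tokens, i) == "ID":
        return 4, i + 1
    raise SyntaxError("no viable alternative")

def nested(tokens, i):
    """Recognize  nested : '(' nested ')' | ID  starting at index i.
    Returns the index just past the match, or -1 on failure."""
    if la(tokens, i) == "ID":
        return i + 1
    if la(tokens, i) == "(":
        j = nested(tokens, i + 1)
        if j != -1 and la(tokens, j) == ")":
            return j + 1
    return -1

def predict_t(tokens):
    """Choose an alternative of the hypothetical rule
         t : nested ';' | nested '!'
    Balanced parentheses cannot be matched by a finite automaton, so this
    decision speculates: it tries each alternative in order and takes the
    first one that matches."""
    for alt, terminator in ((1, ";"), (2, "!")):
        j = nested(tokens, 0)
        if j != -1 and la(tokens, j) == terminator:
            return alt
    raise SyntaxError("no viable alternative")
```

In this sketch, `predict_s(["int", "ID"])` decides after one token, `predict_s(["unsigned", "unsigned", "int", "ID"])` has to look past both `unsigned` tokens, and `predict_t(["(", "(", "ID", ")", ")", "!"])` can only decide by trial-matching each alternative in full. A real generator would derive these decisions from the grammar rather than have them written by hand.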