Scannerless NSLR(1) parsing of programming languages
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
An attribute evaluation of context-free languages
Information Processing Letters
An efficient context-free parsing algorithm
Communications of the ACM
Transition network grammars for natural language analysis
Communications of the ACM
The next 700 programming languages
Communications of the ACM
Packrat parsing:: simple, powerful, lazy, linear time, functional pearl
Proceedings of the seventh ACM SIGPLAN international conference on Functional programming
Rule splitting and attribute-directed parsing
Semantics-Directed Compiler Generation, Proceedings of a Workshop
Attribute-influenced LR parsing
Semantics-Directed Compiler Generation, Proceedings of a Workshop
Disambiguation Filters for Scannerless Generalized LR Parsers
CC '02 Proceedings of the 11th International Conference on Compiler Construction
Attribute-Directed Top-Down Parsing
CC '92 Proceedings of the 4th International Conference on Compiler Construction
Elkhound: A Fast, Practical GLR Parser Generator
Elkhound: A Fast, Practical GLR Parser Generator
ICFP '03 Proceedings of the eighth ACM SIGPLAN international conference on Functional programming
Parsing expression grammars: a recognition-based syntactic foundation
Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on Principles of programming languages
An extension of earley's algorithm for S-attributed grammars
EACL '91 Proceedings of the fifth conference on European chapter of the Association for Computational Linguistics
OOPSLA '04 Proceedings of the 19th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
PADS: a domain-specific language for processing ad hoc data
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
The next 700 data description languages
Conference record of the 33rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
ACM Transactions on Programming Languages and Systems (TOPLAS)
PADS/ML: a functional data description language
Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
From dirt to shovels: fully automatic tool generation from ad hoc data
Proceedings of the 35th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Proceedings of the 14th International Conference on Database Theory
Bringing domain-specific languages to digital forensics
Proceedings of the 33rd International Conference on Software Engineering
A new method for dependent parsing
ESOP'11/ETAPS'11 Proceedings of the 20th European conference on Programming languages and systems: part of the joint European conferences on theory and practice of software
Delayed semantic actions in Yakker
Proceedings of the Eleventh Workshop on Language Descriptions, Tools and Applications
Proceedings of the Eleventh Workshop on Language Descriptions, Tools and Applications
LL(*): the foundation of the ANTLR parser generator
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Formal network packet processing with minimal fuss: invertible syntax descriptions at work
PLPV '12 Proceedings of the sixth workshop on Programming languages meets program verification
The Semantics of Parsing with Semantic Actions
LICS '12 Proceedings of the 2012 27th Annual IEEE/ACM Symposium on Logic in Computer Science
Adaptable parsing expression grammars
SBLP'12 Proceedings of the 16th Brazilian conference on Programming Languages
Hi-index | 0.01 |
We present the design and theory of a new parsing engine, YAKKER, capable of satisfying the many needs of modern programmers and modern data processing applications. In particular, our new parsing engine handles (1) full scannerless context-free grammars with (2) regular expressions as right-hand sides for defining nonterminals. YAKKER also includes (3) facilities for binding variables to intermediate parse results and (4) using such bindings within arbitrary constraints to control parsing. These facilities allow the kind of data-dependent parsing commonly needed in systems applications, particularly those that operate over binary data. In addition, (5) nonterminals may be parameterized by arbitrary values, which gives the system good modularity and abstraction properties in the presence of data-dependent parsing. Finally, (6) legacy parsing libraries,such as sophisticated libraries for dates and times, may be directly incorporated into parser specifications. We illustrate the importance and utility of this rich collection of features by presenting its use on examples ranging from difficult programming language grammars to web server logs to binary data specification. We also show that our grammars have important compositionality properties and explain why such properties areimportant in modern applications such as automatic grammar induction. In terms of technical contributions, we provide a traditional high-level semantics for our new grammar formalization and show how to compile grammars into non deterministic automata. These automata are stack-based, somewhat like conventional push-down automata,but are also equipped with environments to track data-dependent parsing state. We prove the correctness of our translation of data-dependent grammars into these new automata and then show how to implement the automata efficiently using a variation of Earley's parsing algorithm.