DynGenPar: a dynamic generalized parser for common mathematical language
CICM'12 Proceedings of the 11th international conference on Intelligent Computer Mathematics
This paper introduces a dynamic generalized parser aimed primarily at common natural mathematical language. Our algorithm combines the efficiency of GLR parsing, the dynamic extensibility of tableless approaches, and the expressiveness of extended context-free grammars such as parallel multiple context-free grammars (PMCFGs). In particular, it supports efficient dynamic addition of rules to the grammar at any moment. The algorithm is fully incremental: parsing can resume with additional tokens without restarting the parse process, and the parser can predict possible next tokens. Additionally, we handle constraints on the token following a rule. With word tokens, this makes it possible to enforce grammatically correct English indefinite articles ("a" versus "an"). With character tokens, it can represent typical operations of scannerless parsing, such as maximal matches. Our long-term goal is to computerize a large library of existing mathematical knowledge using the new parser, starting from natural language input as found in textbooks or in the papers collected by the digital mathematical library (DML) projects around the world. In this paper, we present the algorithmic ideas behind our approach, give a short overview of the implementation, and present some efficiency results. The new parser is available at http://www.tigen.org/kevin.kofler/fmathl/dyngenpar/.
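The properties listed in the abstract (rule additions at any moment, resumable parsing, next-token prediction, and constraints on the following token) can be illustrated with a small toy recognizer. The sketch below is not the DynGenPar algorithm or its API: it is a naive top-down recognizer written only to show the behaviour, and every name in it (`ToyGrammar`, `ToyParser`, `follow`, `predict`, `feed`) is hypothetical. It assumes a grammar without left recursion, treats any symbol that has no rules as a terminal, and approximates the a/an distinction by spelling rather than pronunciation.

```python
from typing import Callable, Dict, Iterator, List, Optional, Set, Tuple

Stack = Tuple[str, ...]
VOWELS = "aeiou"


class ToyGrammar:
    """Rules live in a plain dict, so they can be added at any time (no tables)."""

    def __init__(self) -> None:
        self.rules: Dict[str, List[Stack]] = {}
        # Hypothetical: terminal -> predicate that the *following* token must satisfy.
        self.follow: Dict[str, Callable[[str], bool]] = {}

    def add_rule(self, lhs: str, rhs: List[str]) -> None:
        self.rules.setdefault(lhs, []).append(tuple(rhs))


class ToyParser:
    def __init__(self, grammar: ToyGrammar, start: str) -> None:
        self.grammar = grammar
        # Each state: (symbols still expected, previously consumed token).
        self.states: Set[Tuple[Stack, Optional[str]]] = {((start,), None)}

    def _expand(self, stack: Stack) -> Iterator[Stack]:
        """Rewrite the leftmost nonterminal until a terminal (or nothing) leads."""
        if not stack or stack[0] not in self.grammar.rules:
            yield stack
            return
        for rhs in self.grammar.rules[stack[0]]:
            yield from self._expand(rhs + stack[1:])

    def _allowed(self, prev: Optional[str], token: str) -> bool:
        """Check the next-token constraint attached to the previous token, if any."""
        check = self.grammar.follow.get(prev) if prev is not None else None
        return check is None or check(token)

    def predict(self) -> Set[str]:
        """Tokens that could legally come next, given the input consumed so far."""
        return {s[0] for stack, prev in self.states
                for s in self._expand(stack) if s and self._allowed(prev, s[0])}

    def feed(self, token: str) -> bool:
        """Consume one more token; earlier input is never re-parsed."""
        self.states = {(s[1:], token) for stack, prev in self.states
                       if self._allowed(prev, token)
                       for s in self._expand(stack) if s and s[0] == token}
        return bool(self.states)

    def accepts(self) -> bool:
        """True if the input consumed so far is a complete sentence."""
        return any(not s for stack, _ in self.states for s in self._expand(stack))


g = ToyGrammar()
g.add_rule("S", ["Det", "N"])
g.add_rule("Det", ["a"])
g.add_rule("Det", ["an"])
g.add_rule("N", ["group"])
g.add_rule("N", ["ideal"])
g.follow["a"] = lambda t: t[0] not in VOWELS   # "a" needs a consonant-initial word
g.follow["an"] = lambda t: t[0] in VOWELS      # "an" needs a vowel-initial word

p = ToyParser(g, "S")
print(sorted(p.predict()))    # ['a', 'an']
p.feed("an")
print(sorted(p.predict()))    # ['ideal']
g.add_rule("N", ["integer"])  # rule added in the middle of a parse
print(sorted(p.predict()))    # ['ideal', 'integer']
p.feed("integer")
print(p.accepts())            # True
```

Because the toy looks rules up only when it expands a symbol, the rule added after `feed("an")` takes effect immediately. This mirrors the tableless idea at a very small scale, but it has none of the efficiency of a GLR-style parser and no support for PMCFGs.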