DynGenPar: a dynamic generalized parser for common mathematical language

Authors:
Kevin Kofler;Arnold Neumaier
Affiliations:
Faculty of Mathematics, University of Vienna, Austria, Wien, Austria;Faculty of Mathematics, University of Vienna, Austria, Wien, Austria
Venue:
CICM'12 Proceedings of the 11th international conference on Intelligent Computer Mathematics
Year:
2012

Abstract

This paper introduces a dynamic generalized parser aimed primarily at common natural mathematical language. Our algorithm combines the efficiency of GLR parsing, the dynamic extensibility of tableless approaches, and the expressiveness of extended context-free grammars such as parallel multiple context-free grammars (PMCFGs). In particular, it supports efficient dynamic addition of rules to the grammar at any moment. The algorithm is fully incremental: parsing can be resumed with additional tokens without restarting the parse process, and the parser can predict the possible next tokens. Additionally, we handle constraints on the token following a rule. With word tokens, this makes it possible to enforce grammatically correct English indefinite articles ("a" versus "an"). With character tokens, it can express typical operations of scannerless parsing, such as maximal matches. Our long-term goal is to computerize a large library of existing mathematical knowledge using the new parser, starting from natural language input as found in textbooks or in the papers collected by the digital mathematical library (DML) projects around the world. In this paper, we present the algorithmic ideas behind our approach, give a short overview of the implementation, and present some efficiency results. The new parser is available at http://www.tigen.org/kevin.kofler/fmathl/dyngenpar/.
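The idea of a constraint on the token following a rule can be illustrated with a minimal Python sketch. This is not DynGenPar's API or algorithm; every name here (`article_allowed`, `identifier_ends`, `maximal_identifier_ends`, and so on) is hypothetical, and the vowel test is a crude spelling-based stand-in for "begins with a vowel sound". The sketch only shows how a predicate on the next token can select the correct indefinite article for word tokens and enforce a maximal match for character tokens.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical type: a predicate on the token that follows a rule's match.
# None stands for "end of input".
NextTokenConstraint = Callable[[Optional[str]], bool]


def starts_with_vowel(token: Optional[str]) -> bool:
    """Spelling-based approximation (an assumption, not a phonetic test)."""
    return token is not None and token[:1].lower() in "aeiou"


# Word tokens: each article rule carries a constraint on the next token.
ARTICLE_CONSTRAINTS: Dict[str, NextTokenConstraint] = {
    "a": lambda nxt: nxt is not None and not starts_with_vowel(nxt),
    "an": starts_with_vowel,
}


def article_allowed(article: str, next_token: Optional[str]) -> bool:
    """Accept the article rule only if the following token satisfies its constraint."""
    return ARTICLE_CONSTRAINTS[article](next_token)


def is_ident_char(ch: str) -> bool:
    return ch.isalnum() or ch == "_"


def identifier_ends(text: str, start: int) -> List[int]:
    """Character tokens: every end position at which an unconstrained
    identifier rule could match, as a generalized parser would consider."""
    ends = []
    end = start
    while end < len(text) and is_ident_char(text[end]):
        end += 1
        ends.append(end)
    return ends


def maximal_identifier_ends(text: str, start: int) -> List[int]:
    """Keep only matches whose next character cannot extend the identifier,
    which leaves exactly the maximal match."""
    return [
        end
        for end in identifier_ends(text, start)
        if end == len(text) or not is_ident_char(text[end])
    ]
```

For example, `article_allowed("an", "integer")` holds while `article_allowed("a", "integer")` does not, and for the input `"abc+1"` the unconstrained rule matches `a`, `ab`, and `abc`, whereas the constrained one keeps only `abc`.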