Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
Scannerless NSLR(1) parsing of programming languages
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Software—Practice & Experience
Software—Practice & Experience
An efficient context-free parsing algorithm
Communications of the ACM
Efficient Parsing for Natural Language: A Fast Algorithm for Practical Systems
Efficient Parsing for Natural Language: A Fast Algorithm for Practical Systems
Even faster generalized LR parsing
Acta Informatica
The ASF+SDF Meta-environment: A Component-Based Language Development Environment
CC '01 Proceedings of the 10th International Conference on Compiler Construction
Disambiguation Filters for Scannerless Generalized LR Parsers
CC '02 Proceedings of the 11th International Conference on Compiler Construction
Parsing expression grammars: a recognition-based syntactic foundation
Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Coping with syntactic ambiguity or how to put the block in the box on the table
Computational Linguistics
Better extensibility through modular syntax
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Declarative, formal, and extensible syntax definition for aspectJ
Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Automatic recursion engineering of reduction incorporated parsers
Science of Computer Programming
BRNGLR: a cubic Tomita-style GLR parsing algorithm
Acta Informatica
Context-aware scanning for parsing extensible languages
GPCE '07 Proceedings of the 6th international conference on Generative programming and component engineering
PYTHON 2.6 Reference Manual
General context-free recognition in less than cubic time
Journal of Computer and System Sciences
Hi-index | 0.00 |
Analysis and renovation of large software portfolios requires syntax analysis of multiple, usually embedded, languages and this is beyond the capabilities of many standard parsing techniques. The traditional separation between lexer and parser falls short due to the limitations of tokenization based on regular expressions when handling multiple lexical grammars. In such cases scannerless parsing provides a viable solution. It uses the power of context-free grammars to be able to deal with a wide variety of issues in parsing lexical syntax. However, it comes at the price of less efficiency. The structure of tokens is obtained using a more powerful but more time and memory intensive parsing algorithm. Scannerless grammars are also more non-deterministic than their tokenized counterparts, increasing the burden on the parsing algorithm even further. In this paper we investigate the application of the Right-Nulled Generalized LR parsing algorithm (RNGLR) to scannerless parsing. We adapt the Scannerless Generalized LR parsing and filtering algorithm (SGLR) to implement the optimizations of RNGLR. We present an updated parsing and filtering algorithm, called SRNGLR, and analyze its performance in comparison to SGLR on ambiguous grammars for the programming languages C, Java, Python, SASL, and C++. Measurements show that SRNGLR is on average 33% faster than SGLR, but is 95% faster on the highly ambiguous SASL grammar. For the mainstream languages C, C++, Java and Python the average speedup is 16%.