Proceedings of the 2002 ACM symposium on Document engineering
Communication with XML often involves pre-agreed document types. In this paper, we propose an offline parser generation approach to enhance online processing performance for documents conforming to a given DTD. Our examination of DTDs and the languages they define demonstrates the existence of ambiguities. We present an algorithm that maps DTDs to deterministic context-free grammars defining the same languages. We prove the grammars to be LL(1) and LALR(1), making them suitable for standard parser generators. Our experiments show the superior performance of generated optimized parsers. Our results generalize from DTDs to XML schema specifications with certain restrictions, most notably the absence of namespaces, which exceed the scope of context-free grammars.
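
To make the idea of mapping a DTD to a context-free grammar more concrete, the following is a minimal sketch of a naive translation of a single element declaration into productions. It is not the algorithm from the paper: the tuple encoding of content models, the function name `to_grammar`, and the nonterminal naming scheme are all assumptions made for illustration, and the sketch makes no attempt to resolve ambiguities or to guarantee that the result is LL(1) or LALR(1).

```python
from itertools import count

def to_grammar(element, model):
    """Translate one content model (nested tuples) into CFG productions.

    Hypothetical encoding: ("elem", name), ("seq", [..]), ("alt", [..]),
    ("star", node), ("opt", node). Returns a dict mapping each nonterminal
    to a list of right-hand sides; [] stands for the empty word.
    """
    fresh = count(1)
    rules = {}

    def new_nt():
        # Fresh helper nonterminal, e.g. book_1, book_2, ...
        return f"{element}_{next(fresh)}"

    def walk(node):
        kind = node[0]
        if kind == "elem":
            # A child element is referenced by its own nonterminal.
            return node[1]
        nt = new_nt()
        if kind == "seq":
            rules[nt] = [[walk(child) for child in node[1]]]
        elif kind == "alt":
            rules[nt] = [[walk(child)] for child in node[1]]
        elif kind == "star":
            # Right recursion: N -> inner N | empty
            rules[nt] = [[walk(node[1]), nt], []]
        elif kind == "opt":
            rules[nt] = [[walk(node[1])], []]
        else:
            raise ValueError(f"unknown content model node: {kind!r}")
        return nt

    body = walk(model)
    # Start and end tags are treated as terminal symbols.
    rules[element] = [[f"<{element}>", body, f"</{element}>"]]
    return rules

# <!ELEMENT book (title, author*)>
model = ("seq", [("elem", "title"), ("star", ("elem", "author"))])
rules = to_grammar("book", model)
```

For the declaration in the comment, `rules` contains a production for `book` that wraps the content in its start and end tags, plus helper nonterminals for the sequence and the repetition. Productions of this shape can be written in the input syntax of a standard parser generator; making them deterministic is the part the paper's algorithm addresses.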