TDX: a high-performance table-driven XML parser

  • Authors: Wei Zhang; Robert A. van Engelen
  • Affiliations: Florida State University, Tallahassee, FL (both authors)
  • Venue: Proceedings of the 44th annual Southeast regional conference
  • Year: 2006

Abstract

This paper presents TDX, a table-driven XML parser. TDX combines parsing and validation into one pass to increase the performance of XML-based applications, such as Web services. The TDX approach is based on the observation that context-free grammars can be automatically derived from XML schema. We developed a parser construction tool to automatically construct TDX grammar productions from a schema. Grammar tokens are defined by the specific schema element names, attribute names, and text. Because most of the structural constraints in XML schema are cast as grammar rules, parsing and validation of XML instances are efficiently implemented. The results show that TDX is several times faster than DOM or SAX parsing with validation enabled.
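The idea of casting schema constraints as grammar rules can be illustrated with a minimal sketch. This is not the TDX implementation or its generated tables: the schema (a `person` element with a required `name` and an optional `email`), the nonterminal names, the `TABLE` layout, and the `parse_and_validate` function are all assumptions made for illustration. The table is written by hand here, whereas the abstract describes a tool that constructs productions from a schema automatically. Attributes, namespaces, and datatypes are ignored.

```python
import re

# Hypothetical LL(1) table for a made-up schema: <person> holds a required
# <name> followed by an optional <email>, each with text content.
# Terminals are token strings such as "<name>", "</name>", and "TEXT".
TABLE = {
    ("PERSON", "<person>"): ["<person>", "NAME", "EMAIL_OPT", "</person>"],
    ("NAME", "<name>"): ["<name>", "TEXT", "</name>"],
    ("EMAIL_OPT", "<email>"): ["<email>", "TEXT", "</email>"],
    ("EMAIL_OPT", "</person>"): [],  # empty production: email is optional
}
NONTERMINALS = {"PERSON", "NAME", "EMAIL_OPT"}

# Simplified tokenizer pattern: attribute-free tags or character data.
TOKEN_RE = re.compile(r"<(/?)([A-Za-z_][\w.-]*)\s*>|([^<]+)")


def tokenize(xml):
    """Yield grammar tokens: tags by name, non-blank character data as TEXT."""
    for match in TOKEN_RE.finditer(xml):
        slash, name, text = match.groups()
        if name:
            yield f"<{slash}{name}>"
        elif text.strip():
            yield "TEXT"
    yield "$"  # end-of-input marker


def parse_and_validate(xml, start="PERSON"):
    """Return True if the input both parses and conforms, in a single pass."""
    tokens = tokenize(xml)
    lookahead = next(tokens)
    stack = ["$", start]
    while stack:
        top = stack.pop()
        if top in NONTERMINALS:
            production = TABLE.get((top, lookahead))
            if production is None:
                return False  # no rule applies: structural violation
            stack.extend(reversed(production))
        elif top == lookahead:
            if top == "$":
                return True  # stack and input exhausted together
            lookahead = next(tokens)
        else:
            return False  # unexpected tag or text
    return False


if __name__ == "__main__":
    print(parse_and_validate("<person><name>Ada</name></person>"))
    print(parse_and_validate("<person><email>a@b.org</email></person>"))
```

Because the element names themselves are the grammar tokens, a document with a missing or misordered element fails at the table lookup, so no separate validation pass over a parsed tree is needed in this sketch.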