The complexity of XPath query evaluation and XML typing

  • Authors:
  • Georg Gottlob; Christoph Koch; Reinhard Pichler; Luc Segoufin

  • Affiliations:
  • Technische Universität Wien, Wien, Austria; Technische Universität Wien, Wien, Austria; Technische Universität Wien, Wien, Austria; INRIA, France

  • Venue:
  • Journal of the ACM (JACM)
  • Year:
  • 2005

Abstract

We study the complexity of two central XML processing problems. The first is XPath 1.0 query processing, which has been shown to be in PTIME in previous work. We prove that both the data complexity and the query complexity of XPath 1.0 fall into lower (highly parallelizable) complexity classes, while the combined complexity is PTIME-hard. Subsequently, we study the sources of this hardness and identify a large and practically important fragment of XPath 1.0 for which the combined complexity is LOGCFL-complete and, therefore, in the highly parallelizable complexity class NC². The second problem is the complexity of validating XML documents against various typing schemes like Document Type Definitions (DTDs), XML Schema Definitions (XSDs), and tree automata, both with respect to data and to combined complexity. For data complexity, we prove that validation is in LOGSPACE and depends crucially on how XML data is represented. For the combined complexity, we show that the complexity ranges from LOGSPACE to LOGCFL, depending on the typing scheme.