Optimizing schema languages for XML: numerical constraints and interleaving

  • Authors:
  • Wouter Gelade;Wim Martens;Frank Neven

  • Affiliations:
  • School for Information Technology, Hasselt University and Transnational University of Limburg;School for Information Technology, Hasselt University and Transnational University of Limburg;School for Information Technology, Hasselt University and Transnational University of Limburg

  • Venue:
  • ICDT'07 Proceedings of the 11th international conference on Database Theory
  • Year:
  • 2007

Abstract

The presence of a schema offers many advantages in processing, translating, querying, and storing XML data. Basic decision problems such as equivalence, inclusion, and non-emptiness of intersection of schemas are the building blocks for schema optimization and integration, and for algorithms for static analysis of transformations. It is therefore paramount to establish the exact complexity of these problems. Most common schema languages for XML can be adequately modeled by some kind of grammar with regular expressions on the right-hand sides. In this paper, we observe that, apart from the usual regular operators of union, concatenation, and Kleene star, schema languages also allow numerical occurrence constraints and interleaving operators. Although the expressiveness of these operators remains within the regular languages, their presence or absence has a significant impact on the complexity of the basic decision problems. We present a complete overview of the complexity of the basic decision problems for DTDs, XSDs, and Relax NG with regular expressions incorporating numerical occurrence constraints and interleaving. We also discuss chain regular expressions and the complexity of the schema simplification problem incorporating the new operators.
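To illustrate why these two operators stay within the regular languages yet can be costly to eliminate, the following minimal sketch rewrites them using only the standard operators. It is an illustration under stated assumptions, not a construction taken from the paper: the helper names `unfold_count` and `unfold_interleave` are hypothetical, Python `re` syntax stands in for schema regular expressions, and interleaving is restricted to distinct single symbols.

```python
import re
from itertools import permutations
from typing import Sequence


def unfold_count(expr: str, low: int, high: int) -> str:
    """Rewrite expr{low,high} without a counting operator (hypothetical helper).

    The result has `high` copies of expr, so its size is linear in the value
    of `high`, which is exponential in the number of bits needed to write it.
    """
    required = f"({expr})" * low
    optional = ""
    # Nest the optional copies: (e(e(e)?)?)? for high - low extra occurrences.
    for _ in range(high - low):
        optional = f"(({expr}){optional})?"
    return required + optional


def unfold_interleave(symbols: Sequence[str]) -> str:
    """Rewrite a1 & a2 & ... & an as a union of all orderings (hypothetical helper).

    Assumes distinct single-character symbols; the union has n! alternatives.
    """
    return "|".join("".join(order) for order in permutations(symbols))


if __name__ == "__main__":
    counted = unfold_count("a", 2, 4)
    print(counted)
    print([w for w in ("a", "aa", "aaa", "aaaa", "aaaaa") if re.fullmatch(counted, w)])

    interleaved = unfold_interleave(["a", "b", "c"])
    print(interleaved)
    print(len(interleaved.split("|")))
```

Both rewritings accept the same words as the original expressions, but the counting expansion grows with the numeric bound and the interleaving expansion grows factorially in the number of symbols. This naive blow-up is one intuition for why the operators can affect the complexity of the decision problems; the paper's own results and proofs are not reproduced here.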