Covering indexes for XML queries: bisimulation - simulation = negation

  • Authors:
  • Prakash Ramanan

  • Affiliations:
  • Department of Computer Science, Wichita State University, Wichita, KS

  • Venue:
  • VLDB '03: Proceedings of the 29th International Conference on Very Large Data Bases - Volume 29
  • Year:
  • 2003

Abstract

Tree Pattern Queries (TPQ), Branching Path Queries (BPQ), and Core XPath (CXPath) are subclasses of the XML query language XPath: TPQ ⊂ BPQ ⊂ CXPath ⊂ XPath. Let TPQ = TPQ+ ⊂ BPQ+ ⊂ CXPath+ ⊂ XPath+ denote the corresponding subclasses, consisting of queries that do not involve the boolean negation operator "not" in their predicates. Simulation and bisimulation are two different binary relations on graph vertices that have previously been studied in connection with some of these classes. For instance, TPQ queries can be minimized using simulation. Most relevantly, for an XML document, its bisimulation quotient is the smallest index that covers (i.e., can be used to answer) all BPQ queries.

Our results are as follows:

  • A CXPath+ query can be evaluated on an XML document by computing the simulation of the query tree by the document graph.
  • For an XML document, its simulation quotient is the smallest covering index for BPQ+. This, together with the previously known result stated above, leads to the following: for BPQ covering indexes of XML documents, Bisimulation - Simulation = Negation.
  • For an XML document, its simulation quotient, with the idref edges ignored throughout, is the smallest covering index for TPQ.

For any XML document, its simulation quotient is never larger than its bisimulation quotient; in some instances, it is exponentially smaller. Our last two results show that disallowing negation in the queries could substantially reduce the size of the smallest covering index.
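The size gap between the two quotients can be illustrated with a minimal Python sketch. It assumes the textbook definitions of simulation and bisimulation over child edges only, which may differ from the exact definitions used in the paper. The toy document, the node names, and the function names are hypothetical, and the naive fixpoint loop is chosen for readability, not efficiency.

```python
def refine(labels, edges, two_way):
    """Largest simulation (two_way=False) or bisimulation (two_way=True).

    Returns the set of pairs (u, v) meaning "v simulates u"
    (or "u and v are bisimilar" when two_way is True).
    Assumption: only outgoing (child) edges are considered.
    """
    succ = {n: set() for n in labels}
    for parent, child in edges:
        succ[parent].add(child)

    # Start with every pair of equally labeled nodes, then discard violations.
    rel = {(u, v) for u in labels for v in labels if labels[u] == labels[v]}
    changed = True
    while changed:
        changed = False
        for u, v in list(rel):
            # Every child of u must be matched by some child of v.
            ok = all(any((cu, cv) in rel for cv in succ[v]) for cu in succ[u])
            if ok and two_way:
                # Bisimulation also requires the converse matching.
                ok = all(any((cu, cv) in rel for cu in succ[u]) for cv in succ[v])
            if not ok:
                rel.discard((u, v))
                changed = True
    return rel


def quotient_classes(labels, rel):
    """Group nodes that are related in both directions."""
    return {
        frozenset(v for v in labels if (u, v) in rel and (v, u) in rel)
        for u in labels
    }


# Hypothetical toy document: two "a" elements under one root.
# x1 has a "b" child with a "c" child, plus an empty "b" child;
# x2 has only a "b" child with a "c" child.
labels = {
    "r": "doc", "x1": "a", "x2": "a",
    "y1": "b", "y2": "b", "y3": "b",
    "z1": "c", "z2": "c",
}
edges = [
    ("r", "x1"), ("r", "x2"),
    ("x1", "y1"), ("x1", "y2"), ("x2", "y3"),
    ("y1", "z1"), ("y3", "z2"),
]

sim_classes = quotient_classes(labels, refine(labels, edges, two_way=False))
bisim_classes = quotient_classes(labels, refine(labels, edges, two_way=True))

print(len(sim_classes), len(bisim_classes))
```

In this sketch, x1 and x2 simulate each other but are not bisimilar, because the empty "b" child of x1 has no bisimilar counterpart under x2. The simulation quotient therefore merges them and ends up one node smaller. A query that tells x1 and x2 apart has to test for a "b" child without a "c" child, which requires negation.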