Honey, I shrunk the XQuery!: an XML algebra optimization approach

  • Authors:
  • Xin Zhang;Bradford Pielech;Elke A. Rundensteiner

  • Affiliations:
  • Worcester Polytechnic Institute, Worcester, MA;Worcester Polytechnic Institute, Worcester, MA;Worcester Polytechnic Institute, Worcester, MA

  • Venue:
  • Proceedings of the 4th international workshop on Web information and data management
  • Year:
  • 2002

Abstract

Much work in the database community addresses the mapping of XML data into and out of relational database systems and, specifically, query processing over such data using XQuery. We discuss our solution, the XML Algebra Tree (XAT), as part of our larger XML management system called Rainbow.

Rainbow uses XQuery to describe the loading of XML data into relational systems and its extraction from them, and also to execute queries against pre-defined XML views of that stored data. The XML algebra tree of a query against those views is merged with the queries that define the views to form a larger tree. Because the XML formatting operators are interleaved with the computation operators, this XAT must then be optimized before being translated into one or more SQL statements that can be executed on the database. SQL translation is composed of computation pushdown and SQL generation.

Computation pushdown splits the tree into XML-specific operators and SQL-doable operators; the latter are then converted into SQL statements. However, the XAT after computation pushdown may contain unreferenced columns or unused operators. Leaving these in the tree creates unnecessarily large SQL statements and slows down overall execution.

Our main contributions to XML query processing, outlined in this paper, are threefold. One, we describe an algebra based on XATs for modeling XQuery expressions. Two, we propose rewriting rules that optimize XQueries by XAT operator cancel-out. Three, we show a cutting algorithm that removes unreferenced columns and operators from the trees. We have fully implemented the techniques discussed in this paper in the Rainbow system. A preliminary experimental study compares execution performance before and after operator cancel-out and cutting.
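
The idea of cutting unreferenced columns and unused operators can be illustrated with a toy sketch. This is not the Rainbow implementation or the algorithm from the paper: the `Op` class, its `uses` and `produces` fields, the `cut` function, and the operator names are all hypothetical, and the tree is simplified to a single chain of operators. It only shows the general principle of walking from the root downward and keeping what some ancestor still references.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Op:
    """Hypothetical operator node; not the XAT operator model from the paper."""
    name: str
    uses: frozenset          # columns this operator reads from its input
    produces: frozenset      # columns this operator adds to its output
    child: Optional["Op"] = None


def cut(op, required):
    """Return a pruned copy of the chain that still computes `required`."""
    if op is None:
        return None
    kept = op.produces & required
    # An operator that only produces columns nobody references is dropped.
    if op.produces and not kept:
        return cut(op.child, required)
    # Columns needed from below: whatever is still missing, plus our own inputs.
    needed_below = (required - op.produces) | op.uses
    return Op(op.name, op.uses, kept, cut(op.child, needed_below))


def chain_names(op):
    """List operator names from the root down (helper for inspection)."""
    names = []
    while op is not None:
        names.append(op.name)
        op = op.child
    return names


# Assumed example: a scan feeding two formatting steps, only one of which is used.
scan = Op("scan_book", frozenset(), frozenset({"title", "price", "year"}))
tag_price = Op("tagger_price", frozenset({"price"}), frozenset({"price_xml"}), scan)
tag_title = Op("tagger_title", frozenset({"title"}), frozenset({"title_xml"}), tag_price)

pruned = cut(tag_title, frozenset({"title_xml"}))
print(chain_names(pruned))          # ['tagger_title', 'scan_book']
print(sorted(pruned.child.produces))  # ['title']
```

In this sketch the unused `tagger_price` step disappears and the scan is narrowed to the single column that is still referenced, which is the kind of reduction that would keep a generated SQL statement small.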