Querying XML documents by dynamic shredding

  • Authors:
  • Hui Zhang; Frank Wm. Tompa

  • Affiliations:
  • University of Manitoba, Winnipeg, MB, Canada; University of Waterloo, Waterloo, ON, Canada

  • Venue:
  • Proceedings of the 2004 ACM Symposium on Document Engineering
  • Year:
  • 2004

Abstract

With the wide adoption of XML as a standard data representation and exchange format, querying XML documents becomes increasingly important. However, relational database systems constitute a much more mature technology than what is available for native storage of XML. To bridge the gap, one way to manage XML data is to use a commercial relational database system. In this approach, users typically first "shred" their documents by isolating what they predict to be meaningful fragments, then store the individual fragments according to some relational schema, and later translate each XML query (e.g., expressed in W3C's XQuery) to SQL queries expressed against the shredded documents. In this paper, we propose an alternative approach that builds on relational database technology but shreds XML documents dynamically. This avoids many of the problems in maintaining document order and reassembling compound data from its fragments. We then present an algorithm to translate a significant subset of XQuery into an extended relational algebra that includes operators defined for the structured text datatype. This algorithm can be used as the basis of a sound translation from XQuery to SQL and the starting point for query optimization, which is required for XML to be supported by relational database technology.
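
For readers unfamiliar with the conventional approach the abstract contrasts itself with, the following is a minimal sketch of static shredding: an XML document is broken into one relational row per element, and an XQuery-style request is then translated by hand into SQL joins over those rows. It does not reproduce the paper's dynamic shredding or its extended relational algebra. The `node` table layout, the sample document, and the function names `shred` and `titles_of_books_after` are all assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the paper.
DOC = (
    "<library>"
    "<book><title>A</title><year>2001</year></book>"
    "<book><title>B</title><year>2004</year></book>"
    "</library>"
)


def shred(xml_text, conn):
    """Statically shred a document: one row per element.

    Ids are assigned in preorder, so sorting by id recovers document order.
    """
    conn.execute(
        "CREATE TABLE node ("
        "id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT, text TEXT)"
    )
    counter = 0

    def visit(elem, parent_id):
        nonlocal counter
        counter += 1
        node_id = counter
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?)",
            (node_id, parent_id, elem.tag, (elem.text or "").strip()),
        )
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)


def titles_of_books_after(conn, year):
    """Hand translation of an XQuery-like request into SQL:

    for $b in /library/book where $b/year > year return $b/title
    """
    rows = conn.execute(
        """
        SELECT t.text
        FROM node AS b
        JOIN node AS y ON y.parent = b.id AND y.tag = 'year'
        JOIN node AS t ON t.parent = b.id AND t.tag = 'title'
        WHERE b.tag = 'book' AND CAST(y.text AS INTEGER) > ?
        ORDER BY t.id
        """,
        (year,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    shred(DOC, conn)
    print(titles_of_books_after(conn, 2002))
```

Even in this small case, each path step becomes a self-join and document order has to be carried explicitly in the `id` column. These are the kinds of reassembly and ordering costs that the abstract says dynamic shredding is meant to avoid.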