Optimized XPath evaluation for schema-compressed XML data

  • Authors:
  • Stefan Böttcher;Rita Hartel;Stefan Heindorf

  • Affiliations:
  • University of Paderborn (Germany), Fürstenallee, Paderborn;University of Paderborn (Germany), Fürstenallee, Paderborn;University of Paderborn (Germany), Fürstenallee, Paderborn

  • Venue:
  • ADC '12 Proceedings of the Twenty-Third Australasian Database Conference - Volume 124
  • Year:
  • 2012

Abstract

XML has become the de facto standard for data exchange in enterprise information systems. But whenever XML data is stored or processed, e.g., in the form of a DOM tree representation, the XML markup causes a huge blow-up of memory consumption compared to the data, i.e., the text and attribute values, contained in the XML document. In this paper, we present an optimized XPath query evaluation for XSDS, an XML compression approach that removes information which is redundant because it can be derived from the existing XML Schema definition (XSD). XSDS thereby allows XML data to be stored and exchanged in a space-efficient yet still queryable way. Previous papers have shown that XSDS generally reaches stronger compression ratios than other approaches such as gzip, bzip2, and XMill, and that XPath queries can be evaluated on XSDS-compressed data. In this paper, we show that optimizing query evaluation on XSDS-compressed data by using the given schema information speeds up query evaluation by a factor of 13, reaching evaluation times that are more than 5 times faster than those of JAXP, the standard Java XPath evaluator. The speedup was reached by avoiding the decompression of large parts of the structure while evaluating the query.
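
The general idea of schema-based compression and schema-guided querying can be illustrated with a toy sketch. It is not the XSDS format or the paper's algorithm. The schema notation (`SCHEMA`, with `seq`, `star`, and `text` content models), the `(name, content)` tree encoding, and the functions `compress` and `query` are all assumptions made up for this illustration. The encoder stores only what the schema cannot predict (text values and repetition counts), and the query walks the flat stream under schema guidance without rebuilding element nodes.

```python
# Toy schema (hypothetical notation, not XSD syntax):
# each element name maps to a content model.
SCHEMA = {
    "library": ("star", "book"),          # zero or more <book> children
    "book": ("seq", ["title", "year"]),   # fixed sequence of children
    "title": ("text",),
    "year": ("text",),
}


def compress(node, schema=SCHEMA):
    """Encode a (name, content) tree as a flat list.

    Element names and the order of children are dropped because the
    schema predicts them; only text values and repetition counts remain.
    """
    name, content = node
    kind = schema[name][0]
    if kind == "text":
        return [content]
    out = []
    if kind == "star":
        out.append(len(content))  # the count is not derivable from the schema
    for child in content:
        out += compress(child, schema)
    return out


def query(stream, root, path, schema=SCHEMA):
    """Return the text values selected by a child-axis path.

    `path` is a list of element names starting at the root, for example
    ["library", "book", "title"]. Only paths ending in a text element
    yield results in this sketch.
    """
    results = []
    pos = 0

    def walk(name, remaining):
        nonlocal pos
        on_path = bool(remaining) and remaining[0] == name
        rest = remaining[1:] if on_path else None
        model = schema[name]
        if model[0] == "text":
            value = stream[pos]
            pos += 1
            if on_path and not rest:
                results.append(value)
        elif model[0] == "seq":
            for child in model[1]:
                walk(child, rest)
        else:  # "star": read the stored repetition count
            count = stream[pos]
            pos += 1
            for _ in range(count):
                walk(model[1], rest)

    walk(root, path)
    return results


doc = ("library", [
    ("book", [("title", "A"), ("year", "2001")]),
    ("book", [("title", "B"), ("year", "2012")]),
])
stream = compress(doc)
titles = query(stream, "library", ["library", "book", "title"])
```

In this sketch, off-path values are still read one by one; a real system would need additional techniques to skip large parts of the compressed structure, which is the kind of optimization the abstract refers to.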