EXsum: an XML summarization framework

  • Authors:
  • José de Aguiar Moraes Filho; Theo Härder

  • Affiliations:
  • University of Kaiserslautern, Kaiserslautern, Germany;University of Kaiserslautern, Kaiserslautern, Germany

  • Venue:
  • IDEAS '08: Proceedings of the 2008 International Symposium on Database Engineering & Applications
  • Year:
  • 2008

Abstract

We propose EXsum (Element-wise XML summarization), a new framework for summarizing XML document properties. It captures statistical information on all important XPath axes related to the nodes that share the same element name in a document. Compared to conventional summaries, it can provide cardinality estimates for a richer spectrum of XPath/XQuery expressions for use in query optimization. For the important class of queries consisting of only one or two location steps, the computed cardinalities are even accurate. Besides adequate storage consumption, it provides fast access times, which helps keep the query optimization overhead low. Using a collection of XML documents with considerable structural variation, we empirically analyzed the EXsum framework in a large number of experiments on XTC, our native XML database management system. These evaluations clearly show the superiority of EXsum over competing approaches in important aspects.
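
To make the idea of an element-wise summary concrete, the following minimal Python sketch groups statistics by element name and, for each name, counts how many nodes are reachable over the child and descendant axes. It is an illustration only and not the EXsum data structure from the paper: the function names `build_summary` and `cardinality`, the dictionary layout, and the restriction to two axes are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def _new_entry():
    # Hypothetical per-element-name record: an occurrence count plus,
    # for each axis, a counter keyed by the name of the related element.
    return {
        "count": 0,
        "child": defaultdict(int),
        "descendant": defaultdict(int),
    }


def build_summary(xml_text):
    """Build a per-element-name summary in one pass over the document."""
    summary = defaultdict(_new_entry)

    def visit(node, ancestor_names):
        summary[node.tag]["count"] += 1
        if ancestor_names:
            # The last name on the path is the parent's element name.
            summary[ancestor_names[-1]]["child"][node.tag] += 1
        # Use a set so that a node is counted once per ancestor name,
        # even if that name occurs several times on its path.
        for name in set(ancestor_names):
            summary[name]["descendant"][node.tag] += 1
        for child in node:
            visit(child, ancestor_names + [node.tag])

    visit(ET.fromstring(xml_text), [])
    return summary


def cardinality(summary, context, axis, target):
    """Number of `target` nodes reached from `context` nodes over `axis`."""
    if context not in summary:
        return 0
    return summary[context][axis].get(target, 0)


DOC = (
    "<lib>"
    "<book><title/><author/><author/></book>"
    "<book><title/></book>"
    "<journal><title/></journal>"
    "</lib>"
)

summary = build_summary(DOC)
book_titles = cardinality(summary, "book", "child", "title")      # //book/title
lib_titles = cardinality(summary, "lib", "descendant", "title")   # //lib//title
```

In this sketch, a two-step expression such as `//book/title` is answered by a single lookup under the `book` entry, which illustrates why such short path expressions can be answered accurately from per-name counters. Longer paths would require combining several entries and would generally yield estimates only.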