Comparison of Complete and Elementless Native Storage of XML Documents

  • Authors:
  • Theo Härder;Christian Mathis;Karsten Schmidt

  • Affiliations:
  • University of Kaiserslautern, Germany;University of Kaiserslautern, Germany;University of Kaiserslautern, Germany

  • Venue:
  • IDEAS '07 Proceedings of the 11th International Database Engineering and Applications Symposium
  • Year:
  • 2007

Abstract

Because XML documents tend to be very large, are accessed by declarative and navigational languages, and are often processed collaboratively using read/write transactions, their fine-grained storage and management in XML DBMSs is a must, which in turn makes a flexible and space-efficient tree representation mandatory. In this paper, we explore a variety of options to natively store, encode, and compress XML documents while preserving the full DBMS processing flexibility on the documents required by the various language models and usage characteristics. Important issues of our empirical study relate to node labeling, document container layout, indexing, and structure and content compression. Encoding and compressing XML documents with their complete structure leads to a space consumption of ~40% to ~60% of their plain representation, whereas structure virtualization (elementless storage) saves, on average, more than 10% in addition.