XML design for relational storage

  • Authors:
  • Solmaz Kolahi;Leonid Libkin

  • Affiliations:
  • University of Toronto;University of Edinburgh

  • Venue:
  • Proceedings of the 16th international conference on World Wide Web
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Design principles for XML schemas that eliminate redundancies and avoid update anomalies have been studied recently. Several normal forms, generalizing those for relational databases, have been proposed. All of them, however, are based on the assumption of anative XML storage, while in practice most of XML data is stored inrelational databases. In this paper we study XML design and normalization for relational storage of XML documents. To be able to relate and compare XML and relational designs, we use an information-theoretic framework that measures information content in relations and documents, with higher values corresponding to lower levels of redundancy. We show that most common relational storage schemes preserve the notion of being well-designed (i.e., anomalies- and redundancy-free). Thus,existing XML normal forms guarantee well-designed relational storagesas well. We further show that if this perfect option is not achievable, then a slight restriction on XML constraints guarantees a "second-best" relational design, according to possible values of the information-theoretic measure. We finally consider an edge-based relational representation of XML documents, and show that while it has similar information-theoretic properties with other relational representations, it can behave significantly worse in terms of enforcing integrity constraints.