Dependency-preserving normalization of relational and XML data

  • Authors:
  • Solmaz Kolahi

  • Affiliations:
  • Department of Computer Science, University of Toronto, 10 King's College road, Toronto, ON, Canada, M5S 3G4

  • Venue:
  • Journal of Computer and System Sciences
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Having a database design that avoids redundant information and update anomalies is the main goal of normalization techniques. Ideally, data as well as constraints should be preserved. However, this is not always achievable: while BCNF eliminates all redundancies, it may not preserve constraints, and 3NF, which achieves dependency preservation, may not always eliminate all redundancies. Our first goal is to investigate how much redundancy 3NF tolerates in order to achieve dependency preservation. We apply an information-theoretic measure and show that only prime attributes admit redundant information in 3NF, but their information content may be arbitrarily low. Then we study the possibility of achieving both redundancy elimination and dependency preservation by a hierarchical representation of relational data in XML. We provide a characterization of cases when an XML normal form called XNF guarantees both. Finally, we deal with dependency preservation in XML and show that like in the relational case, normalizing XML documents to achieve non-redundant data can result in losing constraints.