Database design for incomplete relations

  • Authors:
  • Mark Levene; George Loizou

  • Affiliations:
  • Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, U.K.; Department of Computer Science, Birkbeck College, Malet Street, London

  • Venue:
  • ACM Transactions on Database Systems (TODS)
  • Year:
  • 1999

Abstract

Although there has been a vast amount of research in the area of relational database design, to our knowledge, there has been very little work that considers whether this theory is still valid when relations in the database may be incomplete. When relations are incomplete and thus contain null values, the problem of whether satisfaction is additive arises. Additivity is the property that satisfaction of a set of functional dependencies (FDs) F in an incomplete relation is equivalent to the individual satisfaction of each member of F. It is well known that, in general, satisfaction of FDs is not additive. Previously we have shown that satisfaction is additive if and only if the set of FDs is monodependent. We conclude that monodependence is a fundamental desirable property of a set of FDs when considering incomplete information in relational database design. We show that, when the set of FDs F satisfies either the intersection property or the split-freeness property, the problem of finding an optimum cover of F can be solved in polynomial time in the size of F; in general, this problem is known to be NP-complete. We also show that when F satisfies the split-freeness property, deciding whether there is a superkey of cardinality k or less can be solved in polynomial time in the size of F, since all the keys have the same cardinality. If F only satisfies the intersection property, then this problem is NP-complete, as in the general case. Moreover, we show that when F satisfies either the intersection property or the split-freeness property, deciding whether an attribute is prime can be solved in polynomial time in the size of F; in general, this problem is known to be NP-complete. Assume that a relation schema R is in an appropriate normal form with respect to a set of FDs F. We show that when F satisfies the intersection property, the notions of second normal form and third normal form are equivalent.
We also show that when R is in Boyce-Codd Normal Form (BCNF), F is monodependent if and only if either there is a unique key for R, or for all keys X for R, the cardinality of X is one less than the number of attributes associated with R. Finally, we tackle a long-standing problem in relational database theory by showing that when a set of FDs F over R satisfies the intersection property, it also satisfies the split-freeness property (i.e., is monodependent) if and only if every lossless join decomposition of R with respect to F is also dependency preserving. As a corollary of this result, we are able to show that when F satisfies the intersection property, it also satisfies the split-freeness property (i.e., is monodependent) if and only if every lossless join decomposition of R that is in BCNF is also dependency preserving. Our final result is that when F is monodependent, there exists a unique optimum lossless join decomposition of R, which is in BCNF and is also dependency preserving. Furthermore, this ultimate decomposition can be attained in polynomial time in the size of F.
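
The non-additivity of FD satisfaction mentioned in the abstract can be illustrated with a small sketch. The abstract does not state its definition of satisfaction in an incomplete relation, so the code below assumes a possible-worlds reading: a set of FDs holds if some replacement of the nulls by domain values yields a complete relation that satisfies all of them. The function names, the two-tuple relation, the FDs A → B and B → C, and the finite domain are all illustrative assumptions, not taken from the paper.

```python
from itertools import product

NULL = None  # marker for a missing (null) value; an assumption of this sketch


def satisfies(relation, fd):
    """Classical FD check on a complete relation given as a list of dicts."""
    lhs, rhs = fd
    seen = {}
    for row in relation:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


def completions(relation, domain):
    """Yield every complete relation obtained by replacing each null with a domain value."""
    slots = [(i, a) for i, row in enumerate(relation)
             for a, v in row.items() if v is NULL]
    for values in product(domain, repeat=len(slots)):
        world = [dict(row) for row in relation]
        for (i, a), v in zip(slots, values):
            world[i][a] = v
        yield world


def weakly_satisfies(relation, fds, domain):
    """True if some completion satisfies every FD in fds (assumed semantics)."""
    return any(all(satisfies(w, fd) for fd in fds)
               for w in completions(relation, domain))


# Hypothetical incomplete relation over attributes A, B, C.
r = [{"A": 0, "B": NULL, "C": 0},
     {"A": 0, "B": NULL, "C": 1}]
fds = [(("A",), ("B",)), (("B",), ("C",))]  # A -> B and B -> C
domain = [0, 1]

each = [weakly_satisfies(r, [fd], domain) for fd in fds]
together = weakly_satisfies(r, fds, domain)
print(each, together)
```

Under these assumptions each FD holds on its own (equal B values for A → B, distinct B values for B → C), but no single completion satisfies both, so satisfaction of the set is not equivalent to satisfaction of its members. The brute-force enumeration is exponential in the number of nulls and is meant only to make the definition concrete.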