On Codd Families of Keys over Incomplete Relations

  • Authors: Sven Hartmann, Uwe Leck, Sebastian Link
  • Venue: The Computer Journal
  • Year: 2011

Abstract

Keys allow a database management system to uniquely identify tuples in a database. Consequently, the class of keys is of great significance for almost all data processing tasks. In the relational model of data, keys have received considerable interest and are well understood. However, for efficient means of data processing most commercial relational database systems deviate from the relational model. For example, tuples may contain only partial information in the sense that they contain so-called null values to represent incomplete information. Codd's principle of entity integrity says that every tuple of every relation must not contain a null value on any attribute of the primary key. Therefore, a key over partial relations enforces both uniqueness and totality of tuples on the attributes of the key. On the basis of these two requirements, we study the resulting class of keys over relations that permit occurrences of Zaniolo's null value ‘no-information’. We show that the interaction of this class of keys is different from the interaction of the class of keys over total relations. We establish a finite ground axiomatization, and an algorithm for deciding the associated implication problem in linear time. Further, we characterize Armstrong relations for an arbitrarily given set of keys; that is, we give a sufficient and necessary condition for a partial relation to satisfy a key precisely when it is implied by a given set of keys. We also establish an algorithm that computes an Armstrong relation for an arbitrarily given set of keys. While the problem of finding an Armstrong relation for a given key set is precisely exponential in general, our algorithm returns an Armstrong relation whose size is at most quadratic in the size of a minimal Armstrong relation. Finally, we settle various questions related to the maximal size of a family of non-redundant key sets.
Our results help to bridge the gap between the existing theory of database constraints and database practice.
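
The two requirements named in the abstract, uniqueness and totality on the key attributes, can be illustrated with a small check. This is only a sketch of one reading of that definition and not the paper's algorithm: the function name `satisfies_key`, the use of Python's `None` for the 'no-information' null, and the example attributes `emp`, `dept` and `phone` are all assumptions made for illustration.

```python
# Hypothetical sketch: does a partial relation satisfy a key, in the sense
# of the abstract (totality AND uniqueness on the key attributes)?
# Assumptions: a relation is a list of distinct rows, each row is a dict
# mapping attribute names to values, and None stands for the null marker.

NULL = None


def satisfies_key(relation, key):
    """Return True if every row is total on `key` and no two rows
    agree on all attributes of `key`."""
    key = tuple(key)
    seen = set()
    for row in relation:
        projection = tuple(row[attr] for attr in key)
        # Totality: no null value on any key attribute.
        if any(value is NULL for value in projection):
            return False
        # Uniqueness: no earlier row has the same key projection.
        if projection in seen:
            return False
        seen.add(projection)
    return True


employees = [
    {"emp": 1, "dept": "A", "phone": NULL},
    {"emp": 2, "dept": "A", "phone": "555"},
]

print(satisfies_key(employees, ["emp"]))           # unique and total
print(satisfies_key(employees, ["dept"]))          # violates uniqueness
print(satisfies_key(employees, ["emp", "phone"]))  # violates totality
```

Under this reading, `["emp"]` holds as a key while its superset `["emp", "phone"]` does not, because the added attribute carries a null. Over total relations every superset of a key is again a key, so the example gives one concrete sense in which keys over partial relations can interact differently.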