Information dependencies

  • Authors:
  • Mehmet M. Dalkilic; Edward L. Robertson

  • Affiliations:
  • Indiana University Computer Science, Lindley Hall 215, Bloomington, Indiana;Indiana University Computer Science, Lindley Hall 215, Bloomington, Indiana

  • Venue:
  • PODS '00 Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
  • Year:
  • 2000

Abstract

This paper uses the tools of information theory to examine and reason about the information content of the attributes within a relation instance. For two sets of attributes X and Y, an information dependency measure (InD measure) characterizes the uncertainty remaining about the values for the set Y when the values for the set X are known. A variety of arithmetic inequalities (InD inequalities) are shown to hold among InD measures; InD inequalities hold in any relation instance. Numeric constraints (InD constraints) on InD measures, consistent with the InD inequalities, can be applied to relation instances. Remarkably, functional and multivalued dependencies correspond to setting certain constraints to zero, with Armstrong's axioms shown to be consequences of the arithmetic inequalities applied to constraints. As an analog of completeness, for any set of constraints consistent with the inequalities, we may construct a relation instance that approximates these constraints within any positive ε. InD measures suggest many valuable applications in areas such as data mining.
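
To make the idea of "uncertainty remaining about Y when X is known" concrete, here is a minimal Python sketch. It rests on an assumption the abstract does not state: that the InD measure behaves like the conditional Shannon entropy H(Y | X) = H(XY) - H(X), computed from tuple frequencies in the relation instance. The function names (`entropy`, `ind_measure`) and the sample relation are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(rows, attrs):
    """Shannon entropy (in bits) of the rows projected onto attrs.

    Each row is weighted equally, so the distribution is the empirical
    frequency of each projected value combination.
    """
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def ind_measure(rows, x, y):
    """Assumed InD measure: H(Y | X) = H(X union Y) - H(X)."""
    xy = list(dict.fromkeys(list(x) + list(y)))  # ordered union, no repeats
    return entropy(rows, xy) - entropy(rows, x)


# Hypothetical relation instance: dept determines floor, but not emp.
relation = [
    {"emp": "ann", "dept": "db", "floor": 1},
    {"emp": "bob", "dept": "db", "floor": 1},
    {"emp": "cat", "dept": "ai", "floor": 2},
    {"emp": "dan", "dept": "ai", "floor": 2},
]

# The functional dependency dept -> floor holds, so no uncertainty remains.
print(ind_measure(relation, ["dept"], ["floor"]))

# dept -> emp does not hold: knowing dept leaves two equally likely employees.
print(ind_measure(relation, ["dept"], ["emp"]))
```

Under this assumed definition, the first measure is zero exactly because dept functionally determines floor in the sample instance, which illustrates the abstract's claim that functional dependencies correspond to setting a constraint to zero. The second measure is one bit, reflecting the two equally likely employees per department.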