Database dependency discovery: a machine learning approach

  • Authors:
  • Peter A. Flach; Iztok Savnik

  • Affiliations:
  • Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK. E-mail: Peter.Flach@bristol.ac.uk; Faculty of Computer and Information Science, University of Ljubljana, 1000 Ljubljana, Slovenia. E-mail: Iztok.Savnik@fri.uni-lj.si

  • Venue:
  • AI Communications
  • Year:
  • 1999

Abstract

Database dependencies, such as functional and multivalued dependencies, express the presence of structure in database relations, that can be utilised in the database design process. The discovery of database dependencies can be viewed as an induction problem, in which general rules (dependencies) are obtained from specific facts (the relation). This viewpoint has the advantage of abstracting away as much as possible from the particulars of the dependencies. The algorithms in this paper are designed such that they can easily be generalised to other kinds of dependencies. Like in current approaches to computational induction such as inductive logic programming, we distinguish between top-down algorithms and bottom-up algorithms. In a top-down approach, hypotheses are generated in a systematic way and then tested against the given relation. In a bottom-up approach, the relation is inspected in order to see what dependencies it may satisfy or violate. We give a simple (but inefficient) top-down algorithm, a bi-directional algorithm, and a bottom-up algorithm. In the case of functional dependencies, these algorithms have been implemented in the FDEP system and evaluated experimentally. The bottom-up algorithm is the most efficient of the three, and also outperforms other algorithms from the literature.
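
To make the top-down and bottom-up distinction concrete for functional dependencies, the following is a minimal Python sketch. It is not the FDEP implementation and is not taken from the paper: the function names (`holds`, `top_down_fds`, `violated_by_pairs`), the representation of a relation as a list of dictionaries, and the example data are all assumptions chosen for illustration. The top-down function enumerates candidate left-hand sides by increasing size and tests each against the relation; the bottom-up function inspects pairs of tuples and records which dependencies each pair contradicts.

```python
from itertools import combinations


def holds(relation, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in relation."""
    seen = {}
    for row in relation:
        key = tuple(row[a] for a in lhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def top_down_fds(relation, attributes):
    """Naive top-down search (illustrative only): generate hypotheses
    lhs -> rhs from general (small lhs) to specific (large lhs), test each
    against the relation, and keep only those with a minimal lhs."""
    found = []
    for rhs in attributes:
        others = [a for a in attributes if a != rhs]
        minimal = []
        for size in range(len(others) + 1):
            for lhs in combinations(others, size):
                # Skip candidates already implied by a smaller dependency.
                if any(set(m) <= set(lhs) for m in minimal):
                    continue
                if holds(relation, lhs, rhs):
                    minimal.append(lhs)
        found.extend((lhs, rhs) for lhs in minimal)
    return found


def violated_by_pairs(relation, attributes):
    """Bottom-up flavour (illustrative only): each pair of rows that agrees
    exactly on the attribute set X and differs on A contradicts X -> A."""
    violated = set()
    for r1, r2 in combinations(relation, 2):
        agree = frozenset(a for a in attributes if r1[a] == r2[a])
        for a in attributes:
            if a not in agree:
                violated.add((agree, a))
    return violated


# Hypothetical example relation.
attributes = ["emp", "dept", "city"]
rows = [
    {"emp": "ann", "dept": "cs", "city": "bristol"},
    {"emp": "bob", "dept": "cs", "city": "bristol"},
    {"emp": "cat", "dept": "math", "city": "ljubljana"},
    {"emp": "dan", "dept": "ee", "city": "bristol"},
]
fds = top_down_fds(rows, attributes)
violations = violated_by_pairs(rows, attributes)
```

The top-down sketch tests an exponential number of hypotheses in the worst case, which matches the abstract's description of the simple top-down algorithm as inefficient. The bottom-up sketch only collects violated dependencies; deriving the set of satisfied dependencies from them is a further step that is not shown here.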