Zigzag: a new algorithm for mining large inclusion dependencies in databases

  • Authors:
  • Fabien De Marchi; Jean-Marc Petit

  • Venue:
  • ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
  • Year:
  • 2003

Abstract

In the relational model, inclusion dependencies (INDs) convey much information about data semantics. They generalize foreign keys, which are very popular constraints in practice. However, one seldom knows the set of satisfied INDs in a database. The IND discovery problem in existing databases can be formulated as a data-mining problem. We underline in this article that the exploration of IND expressions from the most general (smallest) INDs to the most specific (largest) INDs does not succeed whenever large INDs have to be discovered. To cope with this problem, we introduce a new algorithm, called Zigzag, which combines the strength of levelwise algorithms (to find some of the smallest INDs) with an optimistic criterion to jump more or less to the largest INDs. Preliminary tests on synthetic databases are presented and discussed. It is worth noting that the main result of this paper is general enough to be applied to other data-mining problems, such as maximal frequent itemset mining.
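
To make the notion of an IND concrete, here is a minimal Python sketch of checking whether a single candidate IND R[X] ⊆ S[Y] is satisfied. It is not the Zigzag algorithm from the paper, only an illustration of the satisfaction test that any discovery algorithm would rely on. The function name ind_holds, the list-of-dicts table representation, and the sample tables orders, customers and mixed_orders are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]  # assumed representation: one dict per tuple


def ind_holds(r: List[Row], x: Sequence[str],
              s: List[Row], y: Sequence[str]) -> bool:
    """Return True if the IND r[x] ⊆ s[y] is satisfied.

    x and y are attribute sequences of equal length; the IND holds when
    every projection of r onto x also appears as a projection of s onto y.
    """
    if len(x) != len(y):
        raise ValueError("both sides of an IND must have the same arity")
    # Project s onto y once, then test each projected tuple of r against it.
    s_proj = {tuple(row[a] for a in y) for row in s}
    return all(tuple(row[a] for a in x) in s_proj for row in r)


# Hypothetical sample data.
customers = [
    {"id": 1, "town": "Lyon"},
    {"id": 2, "town": "Paris"},
    {"id": 3, "town": "Nice"},
]
orders = [
    {"cust_id": 1, "city": "Lyon"},
    {"cust_id": 2, "city": "Paris"},
]
# Each column matches on its own, but the pair (1, "Paris") does not.
mixed_orders = [{"cust_id": 1, "city": "Paris"}]

unary_ok = ind_holds(orders, ["cust_id"], customers, ["id"])
binary_ok = ind_holds(orders, ["cust_id", "city"], customers, ["id", "town"])
mixed_unary_ok = (ind_holds(mixed_orders, ["cust_id"], customers, ["id"])
                  and ind_holds(mixed_orders, ["city"], customers, ["town"]))
mixed_binary_ok = ind_holds(mixed_orders, ["cust_id", "city"],
                            customers, ["id", "town"])
```

With mixed_orders, both unary INDs hold while the binary IND does not. This is why smaller INDs are the more general ones: a satisfied IND of arity n implies all of its lower-arity projections, but not the other way around. A levelwise search therefore has to pass through many small INDs before it can reach a large one, which is the difficulty the abstract points to.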