Mining non-redundant information-theoretic dependencies between itemsets

  • Authors: Michael Mampaey

  • Affiliations: Dept. of Mathematics and Computer Science, University of Antwerp

  • Venue: DaWaK'10: Proceedings of the 12th International Conference on Data Warehousing and Knowledge Discovery
  • Year: 2010

Abstract

We present an information-theoretic framework for mining dependencies between itemsets in binary data. The problem of closure-based redundancy in this context is theoretically investigated, and we present both lossless and lossy pruning techniques. An efficient and scalable algorithm is proposed, which exploits the inclusion-exclusion principle for fast entropy computation. This algorithm is empirically evaluated through experiments on synthetic and real-world data.
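
The abstract mentions using the inclusion-exclusion principle to compute entropy. As a rough illustration of that general idea only (not the paper's actual algorithm, which is not described here), the sketch below derives the joint entropy of a small itemset from itemset supports alone: the probability that some items are present and the rest absent is an alternating sum of supports over supersets. The function names `support`, `subsets`, and `entropy`, and the representation of transactions as Python sets, are assumptions made for this example.

```python
from itertools import combinations
from math import log2


def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    s = frozenset(itemset)
    return sum(1 for t in transactions if s <= t) / len(transactions)


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def entropy(transactions, itemset):
    """Joint entropy (in bits) of the binary items in `itemset`.

    Hypothetical helper: for each split of the itemset into `present`
    and `absent` items, inclusion-exclusion gives
        P(present = 1, absent = 0)
            = sum over J subset of absent of (-1)^|J| * supp(present | J).
    """
    items = frozenset(itemset)
    # Supports of all subsets are the only statistics read from the data.
    supp = {s: support(transactions, s) for s in subsets(items)}
    h = 0.0
    for present in subsets(items):
        absent = items - present
        p = sum((-1) ** len(j) * supp[present | j] for j in subsets(absent))
        if p > 1e-12:  # treat 0 * log 0 as 0 and skip rounding noise
            h -= p * log2(p)
    return h


# Toy data: items "a" and "b" occur independently, each in half the rows.
data = [{"a", "b"}, {"a"}, {"b"}, set()]
h_ab = entropy(data, {"a", "b"})
h_a = entropy(data, {"a"})
```

This naive version enumerates all subsets, so its cost grows exponentially with the size of the itemset; it is meant only to show how entropies can be obtained from support counts, which are the quantities that itemset miners already compute.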