Sampling minimal frequent boolean (DNF) patterns

  • Authors:
  • Geng Li;Mohammed J. Zaki

  • Affiliations:
  • Rensselaer Polytechnic Institute, Troy, NY, USA;Rensselaer Polytechnic Institute, Troy, NY, USA

  • Venue:
  • Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

We tackle the challenging problem of mining the simplest Boolean patterns from categorical datasets. Instead of complete enumeration, which is typically infeasible for this class of patterns, we develop effective sampling methods to extract a representative subset of the minimal Boolean patterns (in disjunctive normal form - DNF). We make both theoretical and practical contributions, which allow us to prune the search space based on provable properties. Our approach can provide a near-uniform sample of the minimal DNF patterns. We also show that the mined minimal DNF patterns are very effective when used as features for classification.