Frequent Closures as a Concise Representation for Binary Data Mining

  • Authors:
  • Jean-Francois Boulicaut;Artur Bykowski

  • Venue:
  • PAKDD '00: Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications
  • Year:
  • 2000

Abstract

Frequent set discovery from binary data is an important problem in data mining. It concerns the discovery of a concise representation of large tables from which descriptive rules, e.g., the popular association rules, can be derived. Our work studies two such representations, namely frequent sets and frequent closures. N. Pasquier and colleagues designed the Close algorithm, which provides frequent sets via the discovery of frequent closures. When one mines highly correlated data, Apriori-based algorithms clearly fail while Close remains tractable. We discuss our implementation of Close and the experimental evidence we obtained from two real-life binary data mining processes. Then, we introduce the concept of almost-closure: generating every frequent set from the frequent almost-closures remains possible, but with a bounded error on frequency. To the best of our knowledge, this is a new concept and, here again, we provide some experimental evidence of its added value.
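
To make the two notions in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation of the algorithm by Pasquier and colleagues, only a brute-force illustration on a toy table. It assumes the usual definition of a closure (the items shared by every row that contains the itemset) and, as a labeled assumption, reads an almost-closure as admitting an item when at most `delta` of those rows lack it. The abstract does not spell out that definition, and the names `support`, `almost_closure`, `closure`, and `ROWS` are hypothetical.

```python
from typing import FrozenSet, List, Set

Row = FrozenSet[str]

# Toy binary table (hypothetical data): each row is the set of items present.
ROWS: List[Row] = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("acd"),
    frozenset("bd"),
]


def support(rows: List[Row], itemset: Set[str]) -> int:
    """Number of rows that contain every item of `itemset`."""
    return sum(1 for row in rows if itemset <= row)


def almost_closure(rows: List[Row], itemset: Set[str], delta: int) -> Set[str]:
    """Items missing from at most `delta` of the rows that contain `itemset`.

    Assumed reading of "almost-closure"; delta = 0 gives the exact closure.
    """
    matching = [row for row in rows if itemset <= row]
    all_items = set().union(*rows)
    return {
        item
        for item in all_items
        if sum(1 for row in matching if item not in row) <= delta
    }


def closure(rows: List[Row], itemset: Set[str]) -> Set[str]:
    """Exact closure: items present in every row that contains `itemset`."""
    return almost_closure(rows, itemset, 0)


if __name__ == "__main__":
    print(sorted(closure(ROWS, {"c"})))            # items implied by c
    print(sorted(almost_closure(ROWS, {"a"}, 1)))  # items nearly implied by a
```

On this toy table, the itemset {c} and its closure {a, c} have the same support, so the frequency of {a, c} can be read from the closure without a separate count. With `delta = 1`, the supports of {a} and {a, b} differ by at most 1, which is the kind of bounded frequency error the abstract refers to.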