Adequate Condensed Representations of Patterns

  • Authors:
  • Arnaud Soulet; Bruno Crémilleux

  • Affiliations:
  • LI, Université François Rabelais de Tours, Blois, France F-41029; GREYC-CNRS, Université de Caen, Caen Cédex, France F-14032

  • Venue:
  • ECML PKDD '08 Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
  • Year:
  • 2008

Abstract

Patterns are central to knowledge discovery from data, but their use is limited by their huge number and their mining cost. Over the last decade, many works have addressed the concept of condensed representation w.r.t. frequency queries. Such representations are several orders of magnitude smaller than the whole collection of patterns, yet they still enable us to regenerate the frequency information of any pattern. Equivalence classes, based on the Galois closure, are at the core of these condensed representations. However, in real-world applications, the interestingness of patterns is evaluated by many other user-defined measures (e.g., confidence, lift, minimum). To the best of our knowledge, these measures have received very little attention in this context. The Galois closure is appropriate for frequency-based measures but unfortunately not for other measures.
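The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the toy transaction database, the item values, and all function names (`support`, `closure`, `closed_patterns`, `regenerate_support`, `min_value`) are assumptions made for illustration only. The sketch shows that the closed patterns alone suffice to regenerate the frequency of any pattern, and that a pattern and its Galois closure can nevertheless disagree on a non-frequency measure such as the minimum.

```python
from itertools import combinations

# Hypothetical toy transaction database (not taken from the paper).
TRANSACTIONS = [
    frozenset("abc"),
    frozenset("abd"),
    frozenset("ab"),
    frozenset("cd"),
]
ITEMS = sorted(set().union(*TRANSACTIONS))

# Hypothetical numeric value attached to each item, used by the "minimum" measure.
ITEM_VALUES = {"a": 5, "b": 1, "c": 3, "d": 4}


def support(pattern):
    """Number of transactions that contain every item of the pattern."""
    return sum(1 for t in TRANSACTIONS if pattern <= t)


def closure(pattern):
    """Galois closure: the items shared by all transactions containing the pattern."""
    covering = [t for t in TRANSACTIONS if pattern <= t]
    if not covering:
        # By convention, a pattern that occurs nowhere closes to the full item set.
        return frozenset(ITEMS)
    return frozenset.intersection(*covering)


def closed_patterns():
    """Map each closed pattern (equal to its own closure) to its support.

    Brute-force enumeration, only suitable for a toy example.
    """
    result = {}
    for size in range(len(ITEMS) + 1):
        for combo in combinations(ITEMS, size):
            pattern = frozenset(combo)
            if support(pattern) > 0 and closure(pattern) == pattern:
                result[pattern] = support(pattern)
    return result


def regenerate_support(pattern, closed):
    """Recover a pattern's support from the closed patterns alone.

    The support of a pattern equals the largest support among its closed supersets;
    a pattern with no closed superset does not occur in the database.
    """
    return max((s for c, s in closed.items() if pattern <= c), default=0)


def min_value(pattern):
    """The 'minimum' measure: smallest item value in a non-empty pattern."""
    return min(ITEM_VALUES[item] for item in pattern)


if __name__ == "__main__":
    closed = closed_patterns()
    a = frozenset("a")
    print(len(closed), "closed patterns")
    print("support of a, regenerated:", regenerate_support(a, closed))
    print("min of a:", min_value(a), "min of closure(a):", min_value(closure(a)))
```

In this toy setting, the pattern `a` and its closure `ab` occur in exactly the same transactions, so they share a frequency, but their minimum values differ. This is one way to see why a closure tailored to frequency does not automatically condense other measures.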