Adequate condensed representations of patterns

  • Authors:
  • Arnaud Soulet;Bruno Crémilleux

  • Affiliations:
  • LI, Université François Rabelais de Tours, Blois, France 41029;GREYC-CNRS, Université de Caen, Caen Cedex, France 14032

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Patterns are at the core of the discovery of a lot of knowledge from data but their uses are limited due to their huge number and their mining cost. During the last decade, many works addressed the concept of condensed representation w.r.t. frequency queries. Such representations are several orders of magnitude smaller than the size of the whole collections of patterns, and also enable us to regenerate the frequency information of any pattern. In this paper, we propose a framework for condensed representations w.r.t. a large set of new and various queries named condensable functions based on interestingness measures (e.g., frequency, lift, minimum). Such condensed representations are achieved thanks to new closure operators automatically derived from each condensable function to get adequate condensed representations. We propose a generic algorithm Mic Mac to efficiently mine the adequate condensed representations. Experiments show both the conciseness of the adequate condensed representations and the efficiency of our algorithm.