Mining formal concepts with a bounded number of exceptions from transactional data

  • Authors:
  • Jérémy Besson;Céline Robardet;Jean-François Boulicaut

  • Affiliations:
  • INSA Lyon, LIRIS CNRS FRE 2672, Villeurbanne, France;INSA Lyon, PRISMA, Villeurbanne, France;INSA Lyon, LIRIS CNRS FRE 2672, Villeurbanne, France

  • Venue:
  • KDID'04 Proceedings of the Third international conference on Knowledge Discovery in Inductive Databases
  • Year:
  • 2004

Abstract

We are designing new data mining techniques on boolean contexts to identify a priori interesting bi-sets (i.e., sets of objects or transactions associated with sets of attributes or items). An important case is formal concept mining (i.e., mining maximal rectangles of true values, or the associated closed sets obtained by means of the so-called Galois connection). It has been applied with some success to, e.g., gene expression data analysis, where objects denote biological situations and attributes denote gene expression properties. However, in such real-life application domains, the Galois association turns out to be too strong when the data are intrinsically noisy. Strong associations that nevertheless accept a bounded number of exceptions would be extremely useful. We study the new pattern domain of α/β concepts, i.e., consistent maximal bi-sets with fewer than α false values per row and fewer than β false values per column. We provide a complete algorithm that computes all the α/β concepts, based on the generation of concept unions pruned thanks to anti-monotonic constraints. An experimental validation on synthetic data is given; it illustrates that more relevant associations can be discovered in noisy data. We also discuss a practical application in molecular biology that illustrates an incomplete but quite useful extraction when not all of the concepts needed beforehand can be discovered.
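
To make the two notions in the abstract concrete, here is a minimal Python sketch. It is not the authors' algorithm: it enumerates formal concepts by brute force on a tiny context and then checks only the bounded-exception condition (fewer than α false values per row, fewer than β per column) on a candidate bi-set. The consistency and maximality conditions of α/β concepts are not defined in the abstract and are deliberately left out. The toy context and all function names are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy boolean context (not taken from the paper):
# rows are objects, columns are attributes, 1 = true, 0 = false.
CONTEXT = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]


def common_columns(rows, ctx):
    """Columns that are true for every row in `rows`."""
    return frozenset(j for j in range(len(ctx[0])) if all(ctx[i][j] for i in rows))


def common_rows(cols, ctx):
    """Rows that are true for every column in `cols`."""
    return frozenset(i for i in range(len(ctx)) if all(ctx[i][j] for j in cols))


def formal_concepts(ctx):
    """Brute-force enumeration of formal concepts (maximal rectangles of 1s).

    A row set is kept when applying the two Galois operators in sequence
    returns the same row set. Exponential in the number of rows, so this is
    only suitable for toy contexts.
    """
    concepts = []
    for size in range(len(ctx) + 1):
        for rows in combinations(range(len(ctx)), size):
            cols = common_columns(rows, ctx)
            if common_rows(cols, ctx) == frozenset(rows):
                concepts.append((frozenset(rows), cols))
    return concepts


def within_exception_bounds(rows, cols, ctx, alpha, beta):
    """Check only the exception bounds stated in the abstract:
    fewer than `alpha` false values per row and fewer than `beta`
    false values per column, counted inside the bi-set (rows, cols)."""
    rows_ok = all(sum(1 for j in cols if not ctx[i][j]) < alpha for i in rows)
    cols_ok = all(sum(1 for i in rows if not ctx[i][j]) < beta for j in cols)
    return rows_ok and cols_ok


concepts = formal_concepts(CONTEXT)
for extent, intent in concepts:
    print(sorted(extent), sorted(intent))

# The bi-set below contains a single false value (row 1, column 2).
noisy_biset = ({0, 1, 2}, {0, 1, 2})
print(within_exception_bounds(*noisy_biset, CONTEXT, alpha=2, beta=2))
print(within_exception_bounds(*noisy_biset, CONTEXT, alpha=1, beta=1))
```

On this toy context the bi-set ({0, 1, 2}, {0, 1, 2}) is not a formal concept, because of its one false value, yet it satisfies the exception bounds for α = β = 2. With α = β = 1 no exception is tolerated, so the check accepts only exact rectangles of true values.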