Supporting bi-cluster interpretation in 0/1 data by means of local patterns

  • Authors:
  • Ruggero G. Pensa; Céline Robardet; Jean-François Boulicaut

  • Affiliations:
  • INSA Lyon, LIRIS CNRS UMR 5205, F-69621 Villeurbanne cedex, France (all three authors). Corresponding author: Ruggero G. Pensa, Tel.: +33 4 72 43 70 24, Fax: +33 4 72 43 87 13, E-mail: ruggero.pensa@insa-lyon.fr

  • Venue:
  • Intelligent Data Analysis - Selected papers from IDA2005, Madrid, Spain
  • Year:
  • 2006

Abstract

Clustering or co-clustering techniques have proved useful in many application domains. A remaining weakness of these techniques is their poor support for characterizing the groups they produce. As a result, interpreting clustering results and discovering knowledge from them can be quite hard. We consider potentially large Boolean data sets that record properties of objects, and we assume the availability of a bi-partition that has to be characterized by means of a symbolic description. Our generic approach exploits collections of local patterns that satisfy user-defined constraints in the data, together with a measure of the accuracy of a given local pattern as a bi-cluster characterization pattern. We consider local patterns that are bi-sets, i.e., sets of objects associated with sets of properties. Two concrete examples are formal concepts (i.e., associated closed sets) and the so-called δ-bi-sets (i.e., an extension of formal concepts towards fault-tolerance). We introduce the idea of a characterizing query, which experts can use to support knowledge discovery from bi-partitions thanks to the available local patterns. The added value is illustrated on benchmark data and three real data sets: a medical data set and two gene expression data sets.
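
As an illustration of the notions named in the abstract, the following minimal Python sketch enumerates the formal concepts of a tiny Boolean data set and ranks them against one bi-cluster. Everything in it is an assumption made for illustration: the toy data, the function names, and in particular the `overlap_score` function, which is a simple Jaccard-based stand-in and not the accuracy measure defined in the paper. The brute-force enumeration is only suitable for toy data.

```python
from itertools import combinations

# Hypothetical toy Boolean data set: object -> set of properties it satisfies.
DATA = {
    "o1": {"p1", "p2"},
    "o2": {"p1", "p2"},
    "o3": {"p2", "p3"},
    "o4": {"p3"},
}
ALL_PROPS = set().union(*DATA.values())


def common_properties(objects):
    """Properties shared by every object in `objects` (all properties if empty)."""
    props = set(ALL_PROPS)
    for o in objects:
        props &= DATA[o]
    return props


def supporting_objects(props):
    """Objects that satisfy every property in `props`."""
    return {o for o, ps in DATA.items() if props <= ps}


def formal_concepts():
    """Enumerate all formal concepts by closing every subset of objects."""
    concepts = set()
    names = sorted(DATA)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            props = common_properties(subset)
            objs = supporting_objects(props)
            # (objs, props) is a pair of associated closed sets.
            concepts.add((frozenset(objs), frozenset(props)))
    return concepts


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def overlap_score(bi_set, bi_cluster):
    """Illustrative stand-in score, NOT the paper's accuracy measure."""
    return jaccard(bi_set[0], bi_cluster[0]) * jaccard(bi_set[1], bi_cluster[1])


# Hypothetical bi-cluster: a group of objects paired with a group of properties.
BI_CLUSTER = (frozenset({"o1", "o2"}), frozenset({"p1", "p2"}))

# A very simple "characterizing query": the concept that best matches the bi-cluster.
best = max(formal_concepts(), key=lambda c: overlap_score(c, BI_CLUSTER))
```

In this toy setting the query simply returns the single best-scoring concept; a more realistic query would presumably also filter the pattern collection by user-defined constraints, such as minimum numbers of objects and properties.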