Supporting bi-cluster interpretation in 0/1 data by means of local patterns

Authors:
Ruggero G. Pensa;Céline Robardet;Jean-François Boulicaut
Affiliations:
(Correspd. Tel.: +33 4 72 43 70 24/ Fax: +33 4 72 43 87 13/ E-mail: ruggero.pensa@insa-lyon.fr) INSA Lyon, LIRIS CNRS UMR 5205, F-69621 Villeurbanne cedex, France;INSA Lyon, LIRIS CNRS UMR 5205, F-69621 Villeurbanne cedex, France;INSA Lyon, LIRIS CNRS UMR 5205, F-69621 Villeurbanne cedex, France
Venue:
Intelligent Data Analysis - Selected papers from IDA2005, Madrid, Spain
Year:
2006

Abstract

Clustering or co-clustering techniques have proved useful in many application domains. A remaining weakness of these techniques is their poor support for characterizing the groups they produce. As a result, interpreting clustering results and discovering knowledge from them can be quite hard. We consider potentially large Boolean data sets that record properties of objects, and we assume that a bi-partition is available and has to be characterized by means of a symbolic description. Our generic approach exploits collections of local patterns that satisfy user-defined constraints in the data, together with a measure of the accuracy of a given local pattern as a bi-cluster characterization pattern. The local patterns we consider are bi-sets, i.e., sets of objects associated with sets of properties. Two concrete examples are formal concepts (i.e., associated closed sets) and the so-called δ-bi-sets (i.e., an extension of formal concepts towards fault-tolerance). We introduce the idea of characterizing queries, which experts can use to support knowledge discovery from bi-partitions thanks to the available local patterns. The added value is illustrated on benchmark data and on three real data sets: a medical data set and two gene expression data sets.
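
To make the notion of a formal concept concrete, the following minimal Python sketch enumerates the formal concepts (pairs of associated closed sets) of a tiny Boolean data set by brute force, then ranks them against one bi-cluster. Everything in it is an assumption made for illustration: the toy data, the function names, and in particular the `overlap_score` function, which is a simple averaged Jaccard similarity standing in for an accuracy measure and is not the measure defined in the paper. The enumeration is exponential in the number of properties and is only meant for toy inputs.

```python
from itertools import combinations

# Hypothetical toy Boolean data: each object maps to the set of properties it has.
DATA = {
    "o1": {"p1", "p2"},
    "o2": {"p1", "p2"},
    "o3": {"p2", "p3"},
    "o4": {"p3"},
}
PROPERTIES = sorted(set().union(*DATA.values()))


def objects_having(props):
    """Return the objects that have every property in `props`."""
    return frozenset(o for o, ps in DATA.items() if props <= ps)


def properties_shared(objs):
    """Return the properties common to every object in `objs`."""
    if not objs:
        # By convention, the empty object set shares all properties.
        return frozenset(PROPERTIES)
    return frozenset.intersection(*(frozenset(DATA[o]) for o in objs))


def formal_concepts():
    """Enumerate all formal concepts (extent, intent) by closing every property subset."""
    concepts = set()
    for size in range(len(PROPERTIES) + 1):
        for props in combinations(PROPERTIES, size):
            extent = objects_having(frozenset(props))
            intent = properties_shared(extent)  # closure of `props`
            concepts.add((extent, intent))
    return concepts


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def overlap_score(bi_set, bi_cluster):
    """Illustrative stand-in score (NOT the paper's measure): mean Jaccard on both dimensions."""
    (objs, props), (c_objs, c_props) = bi_set, bi_cluster
    return (jaccard(objs, c_objs) + jaccard(props, c_props)) / 2


if __name__ == "__main__":
    # Hypothetical bi-cluster to characterize: objects o1, o2 with properties p1, p2.
    bi_cluster = (frozenset({"o1", "o2"}), frozenset({"p1", "p2"}))
    ranked = sorted(formal_concepts(),
                    key=lambda c: overlap_score(c, bi_cluster),
                    reverse=True)
    for extent, intent in ranked:
        print(sorted(extent), sorted(intent),
              round(overlap_score((extent, intent), bi_cluster), 2))
```

In this sketch, a concept whose objects and properties coincide with those of the bi-cluster gets the highest score, which is the intuition behind using local patterns as bi-cluster characterizations. A fault-tolerant pattern such as a δ-bi-set would relax the requirement that every object in the extent has every property in the intent; that extension is not shown here.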