Parameter-Free Hierarchical Co-clustering by n-Ary Splits

Authors:
Dino Ienco;Ruggero G. Pensa;Rosa Meo
Affiliations:
Department of Computer Science, University of Torino, Turin, Italy I-10149;Department of Computer Science, University of Torino, Turin, Italy I-10149;Department of Computer Science, University of Torino, Turin, Italy I-10149
Venue:
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Year:
2009

Abstract

Clustering high-dimensional data is challenging. Classic metrics fail to identify real similarities between objects. Moreover, the huge number of features makes cluster interpretation hard. To tackle these problems, several co-clustering approaches have been proposed that try to compute a partition of objects and a partition of features simultaneously. Unfortunately, these approaches identify only a predefined number of flat co-clusters. It is instead useful for the clusters to be arranged hierarchically, because the hierarchy provides insights into the clusters. In this paper we propose a novel hierarchical co-clustering approach, which builds two coupled hierarchies, one on the objects and one on the features, thus providing insights into both of them. Our approach does not require a pre-specified number of clusters, and produces compact hierarchies because it makes n-ary splits, where n is automatically determined. We validate our approach on several high-dimensional datasets against state-of-the-art competitors.