A hierarchical model-based approach to co-clustering high-dimensional data

  • Authors: Gianni Costa, Giuseppe Manco, Riccardo Ortale
  • Affiliations: ICAR-CNR, Rende (CS), Italy (all three authors)
  • Venue: Proceedings of the 2008 ACM Symposium on Applied Computing
  • Year: 2008

Abstract

We propose a hierarchical, model-based co-clustering framework for handling high-dimensional datasets. The technique views the dataset as a joint probability distribution over row and column variables. Our approach starts by clustering tuples in a dataset, where each cluster is characterized by a different probability distribution. Subsequently, the conditional distribution of attributes over tuples is exploited to discover natural co-clusters in the data. An intensive empirical evaluation highlights the effectiveness of our approach.
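The two stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a small non-negative data matrix, replaces the hierarchical model with a flat, hard-assignment EM-style loop over multinomial row clusters (cluster priors are ignored), and groups columns by the row cluster that holds most of their mass. The function names `normalize`, `kl`, `cluster_rows`, and `cocluster` are hypothetical and chosen only for this illustration.

```python
import math

def normalize(v, eps=1e-9):
    """Turn a non-negative vector into a smoothed probability distribution."""
    total = sum(v) + eps * len(v)
    return [(x + eps) / total for x in v]

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for strictly positive p and q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def cluster_rows(data, k, iters=20):
    """Stage 1 (assumed simplification): cluster tuples (rows), where each
    cluster is described by its own distribution over the columns."""
    rows = [normalize(r) for r in data]  # p(column | row)
    # Farthest-point initialisation keeps the sketch deterministic.
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(kl(r, c) for c in centers)))
    labels = [0] * len(rows)
    for _ in range(iters):
        # Hard E-step: assign each row to the closest cluster distribution.
        labels = [min(range(k), key=lambda j: kl(r, centers[j])) for r in rows]
        # M-step: re-estimate each cluster distribution from its member rows.
        for j in range(k):
            members = [data[i] for i, lab in enumerate(labels) if lab == j]
            if members:
                centers[j] = normalize([sum(col) for col in zip(*members)])
    return labels

def cocluster(data, k):
    """Stage 2 (assumed simplification): attach each column to the row
    cluster in which it carries the most mass."""
    labels = cluster_rows(data, k)
    n_cols = len(data[0])
    # mass[j][c]: total weight of column c inside row cluster j.
    mass = [[0.0] * n_cols for _ in range(k)]
    for row, j in zip(data, labels):
        for c, x in enumerate(row):
            mass[j][c] += x
    # argmax over j of mass[j][c] equals argmax over j of p(cluster j | column c).
    col_labels = [max(range(k), key=lambda j: mass[j][c]) for c in range(n_cols)]
    return labels, col_labels

# Toy matrix with two noisy diagonal blocks (made-up illustrative values).
data = [[5, 4, 0, 0],
        [4, 6, 0, 1],
        [0, 0, 7, 5],
        [1, 0, 6, 8]]
print(cocluster(data, 2))  # ([0, 0, 1, 1], [0, 0, 1, 1])
```

On this toy input, rows 0 and 1 are paired with columns 0 and 1, and rows 2 and 3 with columns 2 and 3. A faithful implementation would use soft (probabilistic) assignments and the hierarchical structure described in the paper, neither of which this sketch attempts.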