Scalable co-clustering algorithms

  • Authors:
  • Bongjune Kwon;Hyuk Cho

  • Affiliations:
  • Biomedical Engineering, The University of Texas at Austin, Austin, TX;Computer Science, Sam Houston State University, Huntsville, TX

  • Venue:
  • ICA3PP'10 Proceedings of the 10th international conference on Algorithms and Architectures for Parallel Processing - Volume Part I
  • Year:
  • 2010

Abstract

Co-clustering has been used extensively in a variety of applications because it can discover latent local patterns that usual unsupervised algorithms such as k-means do not reveal. Recently, a unified view of co-clustering algorithms, called Bregman co-clustering (BCC), has provided a general framework that subsumes several existing co-clustering algorithms, so we expect this framework to be applied to more varied data types. However, the amount of data collected from real-life application domains easily grows too large to fit in the main memory of a single-processor machine. Accordingly, enhancing the scalability of BCC can be a critical challenge in practice. To address this and to enhance its potential for rapid deployment to wider applications with larger data, we parallelize all twelve co-clustering algorithms in the BCC framework using the Message Passing Interface (MPI). In addition, we validate their scalability on eleven synthetic datasets and one real-life dataset, and demonstrate their speedup under varied parameter settings.
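
As a rough illustration of why this kind of computation lends itself to message passing, the sketch below simulates a row-partitioned calculation of co-cluster block means in plain Python. It is not the authors' implementation: the function names (`local_block_stats`, `combine`, `cocluster_means`), the row-wise partitioning, and the choice of block means as the shared statistic are all assumptions made for this example. In an actual MPI program, the element-wise summation in `combine` would typically be carried out by an all-reduce operation across processes.

```python
# Sketch only: simulates a row-partitioned reduction in plain Python.
# All names here are hypothetical and not taken from the paper.

def local_block_stats(rows, row_labels, col_labels, k, l):
    """Sum and count the entries of one chunk of rows per (row cluster, column cluster)."""
    sums = [[0.0] * l for _ in range(k)]
    counts = [[0] * l for _ in range(k)]
    for row, r in zip(rows, row_labels):
        for value, c in zip(row, col_labels):
            sums[r][c] += value
            counts[r][c] += 1
    return sums, counts


def combine(partials, k, l):
    """Add partial sums and counts element-wise (stand-in for an MPI all-reduce)."""
    total_sums = [[0.0] * l for _ in range(k)]
    total_counts = [[0] * l for _ in range(k)]
    for sums, counts in partials:
        for r in range(k):
            for c in range(l):
                total_sums[r][c] += sums[r][c]
                total_counts[r][c] += counts[r][c]
    return total_sums, total_counts


def cocluster_means(data, row_labels, col_labels, k, l, workers):
    """Split rows into contiguous chunks, reduce the partial statistics, return block means."""
    chunk = max(1, (len(data) + workers - 1) // workers)
    partials = [
        local_block_stats(data[s:s + chunk], row_labels[s:s + chunk], col_labels, k, l)
        for s in range(0, len(data), chunk)
    ]
    sums, counts = combine(partials, k, l)
    # Empty co-clusters get a mean of 0.0 by convention in this sketch.
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else 0.0 for c in range(l)]
        for r in range(k)
    ]
```

Because each simulated worker only touches its own rows and the partial statistics are small k-by-l tables, the amount of data exchanged in the reduction does not grow with the number of rows, which is one plausible reason a row-distributed layout scales well for large matrices.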