Coclustering of Human Cancer Microarrays Using Minimum Sum-Squared Residue Coclustering

Authors:
Hyuk Cho;Inderjit S. Dhillon
Venue:
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year:
2008

Abstract

It is widely agreed in microarray analysis that identifying local patterns, characterized by coherent groups of genes and conditions, may reveal previously undetectable cellular processes of genes as well as macroscopic phenotypes of related samples. To cluster genes and conditions simultaneously, we previously developed a fast co-clustering algorithm, Minimum Sum-Squared Residue Co-clustering (MSSRCC), which employs an alternating minimization scheme and generates co-clusters in a checkerboard structure. In this paper, we propose specific strategies that enable MSSRCC to escape poor local minima and resolve the degeneracy problem in partitional clustering algorithms. The strategies are binormalization, deterministic spectral initialization, and incremental local search. We assess the effect of each strategy on both synthetic gene expression datasets and real human cancer microarrays, and provide empirical evidence that MSSRCC with the proposed strategies performs better than existing co-clustering and clustering algorithms. In particular, combining all three strategies leads to the best performance. Furthermore, we illustrate the coherence of the resulting co-clusters in a checkerboard structure, where genes in a co-cluster manifest the phenotype structure of the corresponding samples, and evaluate the enrichment of functional annotations in Gene Ontology (GO).
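As a rough illustration of the kind of objective the name MSSRCC refers to, the sketch below evaluates a sum-squared residue for a fixed row and column partition of an expression matrix. The abstract does not define the residue, so the form used here (entry minus co-cluster row mean, minus co-cluster column mean, plus co-cluster mean) is an assumption. The function name `sum_squared_residue` and the toy data are also hypothetical. It only scores a given checkerboard partition; it does not implement the alternating minimization, binormalization, spectral initialization, or local search described above.

```python
import numpy as np


def sum_squared_residue(A, row_labels, col_labels):
    """Score a checkerboard partition by its total squared residue.

    Assumed residue for an entry in co-cluster (r, c):
        h_ij = a_ij - (row mean within block) - (column mean within block)
               + (block mean)
    """
    A = np.asarray(A, dtype=float)
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)
    total = 0.0
    for r in np.unique(row_labels):
        for c in np.unique(col_labels):
            # Extract the submatrix of rows in cluster r and columns in cluster c.
            block = A[np.ix_(row_labels == r, col_labels == c)]
            row_means = block.mean(axis=1, keepdims=True)
            col_means = block.mean(axis=0, keepdims=True)
            residue = block - row_means - col_means + block.mean()
            total += float((residue ** 2).sum())
    return total


# Toy data: a purely additive matrix (gene offset + sample offset).
genes = np.array([0.0, 1.0, 5.0, 6.0])
samples = np.array([0.0, 2.0, 10.0])
A = np.add.outer(genes, samples)

row_labels = [0, 0, 1, 1]  # two hypothetical gene clusters
col_labels = [0, 0, 1]     # two hypothetical sample clusters

clean_score = sum_squared_residue(A, row_labels, col_labels)

# Perturb a single entry and score the same partition again.
A_noisy = A.copy()
A_noisy[0, 0] += 1.0
noisy_score = sum_squared_residue(A_noisy, row_labels, col_labels)
```

Under this assumed definition, a block whose entries are a gene offset plus a sample offset has zero residue, so `clean_score` is zero while `noisy_score` is positive. A lower score therefore indicates more coherent co-clusters for a given partition.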