Stability-based validation of bicluster solutions

Authors:
Youngrok Lee;Jeonghwa Lee;Chi-Hyuck Jun
Affiliations:
Department of Industrial and Management Engineering, Pohang University of Science and Technology, Pohang 790-784, Republic of Korea (all authors)
Venue:
Pattern Recognition
Year:
2011



Abstract

Bicluster analysis is an unsupervised learning method that detects homogeneous or uniquely characterized two-way subsets of objects and attributes in a data set. It can find groups that traditional cluster analysis may miss, and it helps interpret those groups intuitively, especially for high-dimensional data sets. Because of these advantages, various biclustering algorithms have been developed over the last few years and applied in areas such as bioinformatics and text mining. However, research into the validation of bicluster solutions is rare. We propose a new procedure for validating bicluster solutions, based on a stability index that measures the reproducibility of a solution under variation in the input data set. We quantify the stability of a bicluster solution by generating random resampled data sets from the input data set, obtaining bicluster solutions from them, and evaluating the expected agreement of these solutions with the bicluster solution for the original input data set. Experiments using three artificial data sets and two real gene expression data sets indicate that the proposed method is suitable for validating bicluster solutions.
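
The resample, re-solve, and compare loop described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the resampling scheme, the agreement measure, or the biclustering algorithm, so everything concrete below is an assumption made for illustration: rows are subsampled without replacement, agreement is the Jaccard index over bicluster cells, and `toy_bicluster` is a hypothetical stand-in for a real biclustering algorithm. It is not the authors' stability index.

```python
import random


def toy_bicluster(data, row_ids):
    """Hypothetical stand-in for a real biclustering algorithm (assumption).

    Returns a single bicluster as a set of (row_id, column) cells: the
    columns whose mean exceeds the overall mean, and the rows whose mean
    on those columns exceeds the overall mean.
    """
    n_cols = len(data[0])
    overall = sum(sum(row) for row in data) / (len(data) * n_cols)
    cols = [j for j in range(n_cols)
            if sum(row[j] for row in data) / len(data) > overall]
    if not cols:
        return set()
    rows = [i for i, row in enumerate(data)
            if sum(row[j] for j in cols) / len(cols) > overall]
    return {(row_ids[i], j) for i in rows for j in cols}


def jaccard(a, b):
    """Agreement between two cell sets (assumed measure); 1.0 if both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def stability_index(data, n_resamples=50, fraction=0.8, seed=0):
    """Mean agreement between the original solution and resampled solutions."""
    rng = random.Random(seed)
    n_rows = len(data)
    reference = toy_bicluster(data, list(range(n_rows)))
    scores = []
    for _ in range(n_resamples):
        # Assumed resampling scheme: draw a subset of rows without replacement.
        sampled = sorted(rng.sample(range(n_rows), int(fraction * n_rows)))
        resampled = toy_bicluster([data[i] for i in sampled], sampled)
        # Compare only on rows that are present in the resample.
        kept = set(sampled)
        restricted = {(i, j) for (i, j) in reference if i in kept}
        scores.append(jaccard(restricted, resampled))
    return sum(scores) / len(scores)


# Noise-free example: rows 0-4 and columns 0-2 form a planted bicluster.
planted = [[5.0 if (i < 5 and j < 3) else 0.0 for j in range(6)]
           for i in range(10)]
print(stability_index(planted))
```

On this noise-free example the toy algorithm recovers the planted cells in every resample, so the sketch reports a stability of 1.0. With noisy data or a less reproducible algorithm, the mean agreement would fall below 1, which is the kind of signal a stability index is meant to expose.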