Clustering analysis of microarray gene expression data by splitting algorithm

Authors:
Ruye Wang;Lucas Scharenbroich;Christopher Hart;Barbara Wold;Eric Mjolsness
Affiliations:
Engineering Department, Harvey Mudd College, Claremont, CA and Jet Propulsion Laboratory, M/S 126-347, 4800 Oak Grove Dr., Pasadena, CA;Jet Propulsion Laboratory, M/S 126-347, 4800 Oak Grove Dr., Pasadena, CA;California Institute of Technology, M/C 156-29, Pasadena, CA;California Institute of Technology, M/C 156-29, Pasadena, CA;Institute for Genomics and Bioinformatics, School of Information & Computer Science, University of California, Irvine, CA
Venue:
Journal of Parallel and Distributed Computing - High-performance computational biology
Year:
2003

Citing 1
Cited 2

Implementation of algorithms for maximum matching on nonbipartite graphs.

Efficient two dimensional clustering of microarray gene expression data by means of hybrid similarity measure

Proceedings of the International Conference on Advances in Computing, Communications and Informatics
A semi-supervised hierarchical approach: two-dimensional clustering of microarray gene expression data

Frontiers of Computer Science: Selected Publications from Chinese Universities

Abstract

A clustering method based on recursive bisection is introduced for analyzing microarray gene expression data. Either dimension of a microarray dataset, the genes or the samples, or both, can be clustered in an unsupervised fashion. Alternatively, if prior knowledge of the genes or samples is available, a supervised version of the analysis can be carried out. Either approach can generate a partial or complete binary hierarchy, a dendrogram, that shows the underlying structure of the dataset. Compared with existing clustering methods for microarray data analysis, such as hierarchical and K-means clustering, the method offers much improved computational efficiency while retaining effective separation of data clusters under a distance metric, a straightforward parallel implementation, and useful extraction and presentation of biological information. Clustering results for both synthesized and experimental microarray data are presented to demonstrate the performance of the algorithm.
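
The idea of recursive bisection can be illustrated with a minimal sketch. The abstract does not state the splitting criterion, so the code below assumes a 2-means split under Euclidean distance; this is a stand-in, not the authors' method. All names (`bisect`, `split_tree`, `leaves`, `max_depth`) are hypothetical and chosen for illustration only.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def bisect(rows, indices, iters=20):
    """Split `indices` into two groups with 2-means (an ASSUMED splitting rule)."""
    pts = [rows[i] for i in indices]
    centroid = mean(pts)
    # Deterministic seeds: the point farthest from the centroid,
    # then the point farthest from that first seed.
    a = max(pts, key=lambda p: dist(p, centroid))
    b = max(pts, key=lambda p: dist(p, a))
    left, right = list(indices), []
    for _ in range(iters):
        left = [i for i in indices if dist(rows[i], a) <= dist(rows[i], b)]
        right = [i for i in indices if dist(rows[i], a) > dist(rows[i], b)]
        if not left or not right:
            break  # degenerate split, e.g. all points identical
        new_a = mean([rows[i] for i in left])
        new_b = mean([rows[i] for i in right])
        if new_a == a and new_b == b:
            break  # centers stopped moving
        a, b = new_a, new_b
    return left, right

def split_tree(rows, indices=None, max_depth=None):
    """Build a binary hierarchy by recursive bisection.

    max_depth=None splits down to single items (complete hierarchy);
    a finite max_depth stops early (partial hierarchy).
    """
    if indices is None:
        indices = list(range(len(rows)))
    if len(indices) < 2 or max_depth == 0:
        return {"members": indices}
    left, right = bisect(rows, indices)
    if not left or not right:
        return {"members": indices}  # cannot be split further
    depth = None if max_depth is None else max_depth - 1
    return {
        "members": indices,
        "children": [split_tree(rows, left, depth),
                     split_tree(rows, right, depth)],
    }

def leaves(node):
    """Collect the member lists of all leaf nodes, left to right."""
    if "children" not in node:
        return [node["members"]]
    return [leaf for child in node["children"] for leaf in leaves(child)]

# Toy expression matrix: rows are genes, columns are samples.
rows = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
print(sorted(leaves(split_tree(rows, max_depth=1))))
```

To cluster the samples instead of the genes, the same function can be applied to the transposed matrix, for example `split_tree(list(zip(*rows)))`. The two subtrees produced by each split are independent of one another, which is one plausible reading of why such a scheme parallelizes easily; the sketch itself runs serially.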