Clustering analysis of microarray gene expression data by splitting algorithm

  • Authors:
  • Ruye Wang;Lucas Scharenbroich;Christopher Hart;Barbara Wold;Eric Mjolsness

  • Affiliations:
  • Engineering Department, Harvey Mudd College, Claremont, CA and Jet Propulsion Laboratory, M/S 126-347, 4800 Oak Grove Dr., Pasadena, CA;Jet Propulsion Laboratory, M/S 126-347, 4800 Oak Grove Dr., Pasadena, CA;California Institute of Technology, M/C 156-29, Pasadena, CA;California Institute of Technology, M/C 156-29, Pasadena, CA;Institute for Genomics and Bioinformatics, School of Information & Computer Science, University of California, Irvine, CA

  • Venue:
  • Journal of Parallel and Distributed Computing - High-performance computational biology
  • Year:
  • 2003

Quantified Score

Hi-index 0.03

Abstract

A clustering method based on recursive bisection is introduced for analyzing microarray gene expression data. Either or both dimensions of a given microarray dataset, the genes and the samples, can be classified in an unsupervised fashion. Alternatively, if prior knowledge of the genes or samples is available, a supervised version of the clustering analysis can be carried out. Either approach may be used to generate a partial or complete binary hierarchy, the dendrogram, showing the underlying structure of the dataset. Compared to other clustering methods used for microarray data analysis (such as hierarchical and K-means clustering), the method presented here offers much improved computational efficiency while retaining effective separation of data clusters under a distance metric, a straightforward parallel implementation, and useful extraction and presentation of biological information. Clustering results for both synthesized and experimental microarray data are presented to demonstrate the performance of the algorithm.
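The idea of recursive bisection can be illustrated with a minimal sketch. The abstract does not state the splitting criterion, so the example below assumes a plain 2-means split under Euclidean distance; the function names (`centroid`, `bisect`, `build_tree`, `leaves`), the parameters `min_size` and `max_depth`, and the toy expression matrix are all hypothetical and are not taken from the paper.

```python
from math import dist


def centroid(points):
    """Mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def bisect(rows, data, max_iter=20):
    """Split row indices into two groups with a 2-means step (assumed criterion)."""
    pts = [data[i] for i in rows]
    c = centroid(pts)
    # Deterministic seeds: the point farthest from the centroid,
    # then the point farthest from that first seed.
    a = max(pts, key=lambda p: dist(p, c))
    b = max(pts, key=lambda p: dist(p, a))
    left, right = [], []
    for _ in range(max_iter):
        new_left = [i for i in rows if dist(data[i], a) <= dist(data[i], b)]
        new_right = [i for i in rows if dist(data[i], a) > dist(data[i], b)]
        # Stop on a degenerate split or once the assignment no longer changes.
        if not new_left or not new_right or (new_left, new_right) == (left, right):
            break
        left, right = new_left, new_right
        a = centroid([data[i] for i in left])
        b = centroid([data[i] for i in right])
    return left, right


def build_tree(rows, data, min_size=2, max_depth=None, depth=0):
    """Return a nested dict; internal nodes carry a 'children' list."""
    node = {"rows": rows}
    if len(rows) < min_size or (max_depth is not None and depth >= max_depth):
        return node
    left, right = bisect(rows, data)
    if left and right:  # identical points cannot be split further
        node["children"] = [
            build_tree(left, data, min_size, max_depth, depth + 1),
            build_tree(right, data, min_size, max_depth, depth + 1),
        ]
    return node


def leaves(node):
    """Collect the row-index groups stored at the leaves of the tree."""
    if "children" not in node:
        return [node["rows"]]
    return [leaf for child in node["children"] for leaf in leaves(child)]


# Toy expression matrix (made up): 6 genes (rows) by 3 samples (columns).
data = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.0, 0.1],
    [5.0, 5.1, 5.0],
    [5.1, 5.0, 5.1],
    [5.0, 5.0, 4.9],
]
partial = build_tree(list(range(6)), data, max_depth=1)  # partial hierarchy
full = build_tree(list(range(6)), data)                  # complete hierarchy
```

In this sketch, `max_depth` cuts the recursion short to give a partial hierarchy, while leaving it unset splits down to single rows. Clustering the samples instead of the genes would amount to passing the transposed matrix. The two recursive calls in `build_tree` share no state, which is one way the independence needed for a parallel implementation could arise.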