Visualization and evaluation of clusters for exploratory analysis of gene expression data

  • Authors:
  • Ju Han Kim;Isaac S. Kohane;Lucila Ohno-Machado

  • Affiliations:
  • SNUBI: Seoul National University Biomedical Informatics, Seoul National University School of Medicine, Seoul, Republic of Korea and Children's Hospital Informatics Program, The Children's Hospital ...;Children's Hospital Informatics Program, The Children's Hospital, Harvard Medical School, 300 Longwood Avenue, Boston, MA;Decision Systems Group, Brigham and Women's Hospital, Harvard Medical School, 75 Francis St., Boston, MA

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2002

Abstract

Clustering algorithms have been shown to be useful for exploring large-scale gene expression profiles. Visualization and objective evaluation of clusters are two important considerations when users select among clustering algorithms, but they are often overlooked. The development of a framework and software tools that implement comprehensive data visualization and objective measures of cluster quality is crucial. In this paper, we describe a theoretical framework and formalizations for consistently developing clustering algorithms. A new clustering algorithm was developed within the proposed framework. We demonstrate that a theoretically sound principle can be uniformly applied to the development of a cluster-optimization function, a comprehensive data-visualization strategy, and objective cluster-evaluation measures, as well as to the actual implementation of the principle. The cluster consistency and quality measures of the algorithm are rigorously evaluated against those of popular clustering algorithms for gene expression data analysis (K-means and self-organizing maps) on four data sets, yielding promising results.