Reducing microarray data via nonnegative matrix factorization for visualization and clustering analysis

  • Authors:
  • Weixiang Liu; Kehong Yuan; Datian Ye

  • Affiliations:
  • Research Center of Biomedical Engineering, Life Science Division, Graduate School at Shenzhen, Tsinghua University, Shenzhen 518055, China (all three authors)

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2008

Abstract

In microarray data analysis, each gene expression sample contains measurements for thousands of genes, and reducing this high dimensionality is useful both for visualization and for subsequent clustering of samples. Principal component analysis (PCA) is the traditional, commonly used method for this purpose, but it has limitations. Nonnegative matrix factorization (NMF) is a newer dimension reduction method. In this paper we compare NMF and PCA for dimension reduction. The reduced data are used for visualization and for clustering analysis via k-means on 11 real gene expression datasets. Before the clustering analysis, we visualize the data reduced by NMF and by PCA. The results on one leukemia dataset show that NMF can discover natural clusters and clearly detect one mislabeled sample, while PCA cannot. For clustering analysis via k-means, NMF outperforms PCA in most cases. Our results demonstrate the superiority of NMF over PCA in reducing microarray data.
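The pipeline described in the abstract (reduce with NMF or PCA, then cluster the samples with k-means) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the abstract does not say which NMF algorithm was used, so the sketch uses the standard Lee-Seung multiplicative updates for the Frobenius objective; the data are a small synthetic genes-by-samples matrix rather than any of the 11 datasets; and the helper names `nmf`, `pca`, and `kmeans`, along with the farthest-point initialization of k-means, are hypothetical choices that keep the example deterministic.

```python
import numpy as np

def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate nonnegative V (genes x samples) as W @ H.

    Uses Lee-Seung multiplicative updates (an assumed choice of algorithm).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def pca(V, n_components):
    """Return sample scores (samples x components) on the top principal components."""
    X = V.T - V.T.mean(axis=0)  # samples x genes, each gene centered
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def kmeans(X, k, n_iter=100):
    """Lloyd's k-means on the rows of X with deterministic farthest-point init."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Synthetic stand-in for expression data: 40 genes x 12 samples, two sample groups.
rng = np.random.default_rng(0)
V = np.ones((40, 12))
V[:20, :6] = 10.0   # genes 0-19 are highly expressed in samples 0-5
V[20:, 6:] = 10.0   # genes 20-39 are highly expressed in samples 6-11
V += 0.5 * rng.random(V.shape)  # small nonnegative noise

W, H = nmf(V, rank=2)
labels_nmf = kmeans(H.T, k=2)          # cluster samples by their NMF encodings
labels_pca = kmeans(pca(V, 2), k=2)    # cluster samples by their PCA scores
print("NMF + k-means:", labels_nmf)
print("PCA + k-means:", labels_pca)
```

On data this cleanly separated, both reductions should recover the two groups, so the sketch shows only the mechanics of the comparison and says nothing about which method performs better on real microarray data. Note that NMF requires nonnegative input, which expression intensities satisfy, whereas PCA centers the data and produces components with mixed signs.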