Improving molecular cancer class discovery through sparse non-negative matrix factorization

  • Authors:
  • Yuan Gao;George Church

  • Affiliations:
  • Department of Genetics, Harvard Medical School, Boston, MA 02115, USA;Department of Genetics, Harvard Medical School, Boston, MA 02115, USA

  • Venue:
  • Bioinformatics
  • Year:
  • 2005

Quantified Score

Hi-index 3.84

Abstract

Motivation: Identifying distinct cancer classes or subclasses with similar morphological appearances is a challenging problem and has important implications for cancer diagnosis and treatment. Clustering based on gene-expression data has been shown to be a powerful method for cancer class discovery. Non-negative matrix factorization is one such method and has been shown to be advantageous over other clustering techniques, such as hierarchical clustering or self-organizing maps. In this paper, we investigate the benefit of explicitly enforcing sparseness in the factorization process.

Results: We report an improved unsupervised method for cancer classification from gene-expression profiles via sparse non-negative matrix factorization. We demonstrate the improvement by direct comparison with classic non-negative matrix factorization on three well-studied datasets. In addition, we illustrate how to identify a small subset of co-expressed genes that may be directly involved in cancer.

Contact: g1m1c1@receptor.med.harvard.edu, ygao@receptor.med.harvard.edu

Supplementary information: http://arep.med.harvard.edu/snmf/supplement.htm
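To make the idea of sparse factorization concrete, the following is a minimal sketch and not the authors' algorithm, whose update rules are not given in this abstract. It assumes a genes-by-samples matrix V, an L1 penalty of weight `lam` on the sample-coefficient matrix H, and generic multiplicative updates; the function name `sparse_nmf`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np


def sparse_nmf(V, k, lam=0.1, n_iter=500, seed=0, eps=1e-9):
    """Illustrative sparse NMF: V (genes x samples) ~ W @ H with W, H >= 0.

    Assumption: sparseness is encouraged by an L1 penalty `lam` on H,
    using generic multiplicative updates (not the paper's exact rules).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        # The penalty appears as the constant `lam` in the denominator,
        # which shrinks small coefficients in H towards zero.
        H *= (W.T @ V) / (W.T @ W @ H + lam + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Rescale so each column of W has unit norm; W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H


# Toy data (invented for illustration): two groups of samples, each
# over-expressing its own block of genes, plus small non-negative noise.
rng = np.random.default_rng(1)
V = 0.1 * rng.random((20, 10))
V[:10, :5] += 5.0
V[10:, 5:] += 5.0

W, H = sparse_nmf(V, k=2)
labels = H.argmax(axis=0)  # assign each sample to its dominant factor
```

In this sketch each sample is assigned to the factor with the largest coefficient in its column of H, and the largest entries in a column of W point to the genes that drive that factor, which is one simple way to read off a small co-expressed gene subset.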