Identification of biomarkers for prostate cancer prognosis using a novel two-step cluster analysis

  • Authors:
  • Xin Chen;Shizhong Xu;Yipeng Wang;Michael McClelland;Zhenyu Jia;Dan Mercola

  • Affiliations:
  • Department of Pathology and Laboratory Medicine, University of California, Irvine;Department of Botany and Plant Sciences, University of California, Riverside;AltheaDx Inc., San Diego;Department of Pathology and Laboratory Medicine, University of California, Irvine and Vaccine Research Institute of San Diego;Department of Pathology and Laboratory Medicine, University of California, Irvine;Department of Pathology and Laboratory Medicine, University of California, Irvine

  • Venue:
  • PRIB'11 Proceedings of the 6th IAPR international conference on Pattern recognition in bioinformatics
  • Year:
  • 2011

Abstract

Prognosis of prostate cancer is challenging because clinical variables such as Gleason score, metastasis stage, surgical margin status, seminal vesicle invasion status and preoperative prostate-specific antigen level give only an incomplete assessment. Whole-genome gene expression assays provide an opportunity to identify molecular indicators for predicting disease outcomes. However, heterogeneity in the cell composition of tissue samples usually produces inconsistent results in cancer profiling studies. We developed a two-step strategy to identify prognostic biomarkers for prostate cancer that takes into account the variation due to mixed tissue samples. In the first step, an unsupervised EM clustering analysis was applied to each gene to cluster patient samples into subgroups based on the expression values of that gene. In the second step, genes were selected based on χ² correlation analysis between the cluster indicators obtained in the first step and the observed clinical outcomes. Two simulation studies showed that the proposed method identified 30% more prognostic genes than traditional differential expression analysis methods such as SAM and LIMMA. We also analyzed a real prostate cancer expression data set using both the new method and the traditional methods. Pathway analysis showed that the genes identified with the new method are significantly enriched for prostate cancer relevant pathways such as the Wnt signaling pathway and the TGF-β signaling pathway. These genes were not detected by the traditional methods.
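
The two-step procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes two clusters per gene, a one-dimensional Gaussian mixture fitted by EM, a binary clinical outcome, and a plain 2x2 χ² test without continuity correction. The function names (`em_two_clusters`, `chi_square_2x2`, `demo`) and the synthetic data are hypothetical and exist only for illustration.

```python
import math
import random


def em_two_clusters(values, n_iter=100):
    """Step 1 (sketch): fit a two-component 1-D Gaussian mixture by EM
    and return a hard cluster label (0 or 1) for every sample."""
    lo, hi = min(values), max(values)
    mu = [lo, hi]                       # start the means at the extremes
    spread = (hi - lo) / 4 or 1.0       # fall back to 1.0 if all values are equal
    var = [spread ** 2, spread ** 2]
    weight = [0.5, 0.5]
    resp = [0.5] * len(values)
    for _ in range(n_iter):
        # E-step: posterior probability that each sample belongs to component 1
        resp = []
        for x in values:
            dens = [
                weight[k]
                * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in (0, 1)
            ]
            total = dens[0] + dens[1]
            resp.append(dens[1] / total if total > 0 else 0.5)
        # M-step: re-estimate weight, mean and variance of each component
        for k in (0, 1):
            r = [p if k == 1 else 1 - p for p in resp]
            nk = sum(r)
            if nk < 1e-9:
                continue                # component is empty; keep old parameters
            mu[k] = sum(ri * x for ri, x in zip(r, values)) / nk
            var[k] = max(
                sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, values)) / nk, 1e-6
            )
            weight[k] = nk / len(values)
    return [1 if p > 0.5 else 0 for p in resp]


def chi_square_2x2(labels, outcomes):
    """Step 2 (sketch): Pearson chi-square statistic and p-value (1 d.f.)
    for the association between cluster labels and binary outcomes."""
    table = [[0, 0], [0, 0]]
    for c, y in zip(labels, outcomes):
        table[c][y] += 1
    n = len(labels)
    row = [sum(table[0]), sum(table[1])]
    col = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    stat = 0.0
    for i in (0, 1):
        for j in (0, 1):
            expected = row[i] * col[j] / n
            if expected == 0:
                return 0.0, 1.0         # degenerate table: no evidence of association
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x/2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def demo(seed=0):
    """Synthetic illustration: one gene tied to outcome, one pure-noise gene."""
    rng = random.Random(seed)
    outcomes = [0] * 30 + [1] * 30      # hypothetical binary outcome per patient
    genes = {
        "informative": [rng.gauss(4.0 if y else 0.0, 1.0) for y in outcomes],
        "noise": [rng.gauss(0.0, 1.0) for _ in outcomes],
    }
    return {
        name: chi_square_2x2(em_two_clusters(expr), outcomes)
        for name, expr in genes.items()
    }


if __name__ == "__main__":
    for gene, (stat, p) in demo().items():
        print(f"{gene}: chi2={stat:.2f}, p={p:.3g}")
```

In a real analysis the same two calls would be repeated for every gene on the array, and the resulting p-values would need a multiple-testing adjustment before genes are selected; the abstract does not state how the authors handled either the number of clusters or the selection threshold, so those choices are left open here.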