Finding informative genes for prostate cancer: a general framework of integrating heterogeneous sources

  • Authors:
  • Liang Ge;Jing Gao;Nan Du;Aidong Zhang

  • Affiliations:
  • State University of New York at Buffalo, Buffalo;State University of New York at Buffalo, Buffalo;State University of New York at Buffalo, Buffalo;State University of New York at Buffalo, Buffalo

  • Venue:
  • Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
  • Year:
  • 2012

Abstract

Finding informative genes for prostate cancer is a long-standing problem in cancer research. With the widespread use of genomic analysis, high-throughput microarray experiments allow a large number of genes to be screened efficiently for informative ones [5--9]. Separately, clinical studies have already identified several genes as important in prostate cancer development and progression [23--28]. These results come from heterogeneous sources, in different formats, and reflect different perspectives on the problem of finding informative genes for prostate cancer. In this work, we aim to find informative genes for prostate cancer by combining these heterogeneous sources of information. We propose a general framework that encodes ranked lists of informative genes [5--9], microarray expression data [5--9], and important genes identified in clinical studies [23--28]. The framework estimates the conditional probability that a gene is informative and ranks the genes by this probability. We formulate the estimation as an optimization problem and propose an efficient iterative algorithm to solve it. Furthermore, we show that the formulation is convex and that the iterative algorithm converges to the global optimum. Extensive experiments show that using heterogeneous information helps in finding informative genes and that the proposed method outperforms many baseline methods.
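
The abstract does not give the objective function, so the following is only an illustrative sketch of the general idea of combining ranked lists with previously identified genes into one per-gene score. It is not the paper's formulation: it swaps in a simple weighted least-squares consensus, which is convex and happens to have a closed-form minimizer, so no iteration is needed. The function names, the `prior_weight` parameter, and the gene labels "A" to "D" are all hypothetical.

```python
def rank_scores(ranked, genes):
    """Map one ranked list to scores in [0, 1].

    The top-ranked gene scores 1; genes missing from the list score 0.
    """
    n = len(ranked)
    pos = {g: i for i, g in enumerate(ranked)}
    return {g: (n - pos[g]) / n if g in pos else 0.0 for g in genes}


def consensus_scores(ranked_lists, known_genes, prior_weight=1.0):
    """Per-gene score minimizing a convex least-squares objective (assumed form).

    For each gene g the score p minimizes
        sum_k (p - s_k[g])**2 + prior_weight * (p - 1)**2 * [g in known_genes],
    whose closed-form solution is a weighted average.
    """
    genes = sorted(set().union(*ranked_lists, known_genes))
    per_list = [rank_scores(r, genes) for r in ranked_lists]
    scores = {}
    for g in genes:
        total = sum(s[g] for s in per_list)
        weight = float(len(per_list))
        if g in known_genes:
            # Previously identified genes are pulled toward a score of 1.
            total += prior_weight
            weight += prior_weight
        scores[g] = total / weight
    return scores


def rank_genes(scores):
    """Order genes by descending score, breaking ties by name."""
    return sorted(scores, key=lambda g: (-scores[g], g))


if __name__ == "__main__":
    # Hypothetical placeholder labels, not real gene names or data.
    lists = [["A", "B", "C"], ["B", "A", "D"]]
    known = {"D"}
    scores = consensus_scores(lists, known)
    print(rank_genes(scores))
```

In this toy input, gene "D" appears low in only one list but is lifted above "C" because it belongs to the set of previously identified genes, which is the kind of effect that combining heterogeneous sources is meant to capture.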