Finding informative genes for prostate cancer: a general framework of integrating heterogeneous sources

Authors:
Liang Ge;Jing Gao;Nan Du;Aidong Zhang
Affiliations:
State University of New York at Buffalo, Buffalo;State University of New York at Buffalo, Buffalo;State University of New York at Buffalo, Buffalo;State University of New York at Buffalo, Buffalo
Venue:
Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Year:
2012

Citing 5
Cited 0

Matrix multiplication via arithmetic progressions. Journal of Symbolic Computation, Special issue on computational algebraic complexity.
Rank aggregation methods for the Web. Proceedings of the 10th International Conference on World Wide Web.
Cluster ensembles --- a knowledge reuse framework for combining multiple partitions. The Journal of Machine Learning Research.
Solving cluster ensemble problems by bipartite graph partitioning. ICML '04: Proceedings of the Twenty-First International Conference on Machine Learning.
Finding Informative Genes from Multiple Microarray Experiments: A Graph-based Consensus Maximization Model. BIBM '11: Proceedings of the 2011 IEEE International Conference on Bioinformatics and Biomedicine.

Abstract

Finding informative genes for prostate cancer is a long-standing problem in cancer research. With the widespread use of genomic analysis, high-throughput microarray experiments make it possible to screen a large number of genes efficiently for informative ones [5--9]. Separately, clinical studies have already identified several genes as important in prostate cancer development and progression [23--28]. These results come from heterogeneous sources, in different formats, and reflect different perspectives on the problem of finding informative genes for prostate cancer. In this work, we aim to find informative genes for prostate cancer by combining these heterogeneous sources of information. We propose a general framework that encodes ranked lists of informative genes [5--9], microarray expression data [5--9], and important genes identified in clinical studies [23--28]. The framework estimates the conditional probability that a gene is informative and ranks genes by this probability. We formulate the estimation as an optimization problem and propose an efficient iterative algorithm to solve it. Furthermore, we show that the problem formulation is convex and that the iterative algorithm converges to the global optimum. Extensive experiments show that using heterogeneous information is very helpful in finding informative genes and that the proposed method outperforms many baseline methods.
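
The abstract does not give the objective function, so the following is only a minimal sketch of the general idea (estimate a per-gene probability by iteratively minimizing a convex objective, then rank genes by it), not the paper's formulation. It assumes a simple bipartite consensus model: each ranked list contributes a "top" group and a "rest" group, and clinically known genes are pulled toward 1. The function name `consensus_scores`, the parameters `top_k`, `alpha`, `beta`, and the gene names are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def consensus_scores(
    ranked_lists: List[List[str]],
    known_genes: Set[str],
    top_k: int = 2,
    alpha: float = 1.0,   # assumed weight tying each group to its prior
    beta: float = 2.0,    # assumed weight tying known genes to 1.0
    iters: int = 200,
    tol: float = 1e-9,
) -> Tuple[List[str], Dict[str, float]]:
    """Illustrative alternating minimization of the convex quadratic
        sum_{gene i in group g} (u_i - q_g)^2
        + alpha * sum_g (q_g - prior_g)^2
        + beta  * sum_{i known} (u_i - 1)^2
    where u_i is the score of gene i and q_g the score of group g.
    """
    genes = sorted({g for lst in ranked_lists for g in lst} | set(known_genes))

    # Each ranked list yields two groups: its top-k genes (prior 1.0)
    # and every other gene (prior 0.0).
    groups: List[Tuple[Set[str], float]] = []
    for lst in ranked_lists:
        top = set(lst[:top_k])
        groups.append((top, 1.0))
        groups.append((set(genes) - top, 0.0))

    u = {g: 0.5 for g in genes}  # uninformed starting point
    for _ in range(iters):
        # Closed-form minimizer over group scores with gene scores fixed.
        q = [
            (sum(u[g] for g in members) + alpha * prior) / (len(members) + alpha)
            for members, prior in groups
        ]
        # Closed-form minimizer over gene scores with group scores fixed.
        new_u = {}
        for g in genes:
            num = sum(q[k] for k, (members, _) in enumerate(groups) if g in members)
            den = float(sum(1 for members, _ in groups if g in members))
            if g in known_genes:
                num += beta
                den += beta
            new_u[g] = num / den if den else u[g]
        delta = max(abs(new_u[g] - u[g]) for g in genes)
        u = new_u
        if delta < tol:
            break

    ranking = sorted(genes, key=lambda g: -u[g])
    return ranking, u


# Placeholder inputs: three ranked lists and one clinically known gene.
lists = [
    ["geneA", "geneB", "geneC", "geneD"],
    ["geneB", "geneA", "geneD", "geneC"],
    ["geneA", "geneD", "geneB", "geneC"],
]
ranking, scores = consensus_scores(lists, known_genes={"geneB"})
```

Because every update is a weighted average of values in [0, 1], the scores stay in that range, and each step can only decrease the quadratic objective. In this toy input, `geneC` never appears in a top group and is not a known gene, so it is ranked last.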