Finding disease similarity based on implicit semantic similarity
Journal of Biomedical Informatics
Computational Biology and Chemistry
Prioritizing disease genes by bi-random walk
PAKDD'12 Proceedings of the 16th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
An efficient genome fragment assembling using GA with neighborhood aware fitness function
Applied Computational Intelligence and Soft Computing - Special issue on Awareness Science and Engineering
Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Motivation: Clinical diseases are characterized by distinct phenotypes. Identifying disease genes amounts to elucidating gene–phenotype relationships. Because mutations in functionally related genes may result in similar phenotypes, it is reasonable to predict disease-causing genes by integrating phenotypic and genomic data. Some genetic diseases are genetically or phenotypically similar and may share common pathogenetic mechanisms. Identifying the relationships between diseases will facilitate a better understanding of these mechanisms.

Results: In this article, we constructed a heterogeneous network by connecting the gene network and the phenotype network using phenotype–gene relationship information from the OMIM database. We extended the random walk with restart algorithm to this heterogeneous network, so that the algorithm prioritizes genes and phenotypes simultaneously. We used leave-one-out cross-validation to evaluate its ability to recover gene–phenotype relationships, and the results showed improved performance over previous methods. We also used the algorithm to uncover hidden disease associations that cannot be found using the gene network or the phenotype network alone. We identified 18 hidden disease associations, most of which were supported by literature evidence.

Availability: The MATLAB code of the program is available at http://www3.ntu.edu.sg/home/aspatra/research/Yongjin_BI2010.zip

Contact: yongjin.li@gmail.com

Supplementary information: Supplementary data are available at Bioinformatics online.
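The random walk with restart on a heterogeneous network, as described in the Results, can be illustrated with a minimal sketch. This is not the authors' MATLAB program. The abstract does not give the normalization scheme or the parameters, so everything here is an assumption made for illustration: the names `lam` (probability of jumping between the two networks), `gamma` (restart probability), and `eta` (weight of phenotype seeds relative to gene seeds), the row-normalization rule, and the toy adjacency matrices.

```python
def fill_row(row, intra, inter, intra_off, inter_off, lam):
    """Write one node's outgoing transition probabilities into `row`."""
    s_intra, s_inter = sum(intra), sum(inter)
    # Split probability between the two networks only when the node has
    # both within-network and cross-network edges (assumed rule).
    both = s_intra > 0 and s_inter > 0
    w_intra = (1.0 - lam) if both else 1.0
    w_inter = lam if both else 1.0
    if s_intra > 0:
        for j, w in enumerate(intra):
            row[intra_off + j] = w_intra * w / s_intra
    if s_inter > 0:
        for j, w in enumerate(inter):
            row[inter_off + j] = w_inter * w / s_inter


def build_transition(A_G, A_P, B, lam):
    """Transition matrix over genes (indices 0..n-1) then phenotypes."""
    n, m = len(A_G), len(A_P)
    M = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):  # gene rows: gene-gene edges plus row i of B
        fill_row(M[i], A_G[i], B[i], 0, n, lam)
    for k in range(m):  # phenotype rows: phenotype edges plus column k of B
        fill_row(M[n + k], A_P[k], [B[i][k] for i in range(n)], n, 0, lam)
    return M


def seed_vector(n, m, seed_genes, seed_phenos, eta):
    """Initial distribution: (1 - eta) on seed genes, eta on seed phenotypes."""
    p0 = [0.0] * (n + m)
    for g in seed_genes:
        p0[g] = (1.0 - eta) / len(seed_genes)
    for k in seed_phenos:
        p0[n + k] = eta / len(seed_phenos)
    return p0


def random_walk_with_restart(M, p0, gamma=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - gamma) * M^T p + gamma * p0 until the L1 change < tol."""
    size = len(p0)
    p = list(p0)
    for _ in range(max_iter):
        nxt = [
            gamma * p0[j]
            + (1.0 - gamma) * sum(M[i][j] * p[i] for i in range(size))
            for j in range(size)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy data (hypothetical): 4 genes in a chain, 3 phenotypes in a chain,
# and two known gene-phenotype links (gene 0-phenotype 0, gene 3-phenotype 2).
A_G = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
A_P = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
B = [[1, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 1]]

M = build_transition(A_G, A_P, B, lam=0.5)
p0 = seed_vector(4, 3, seed_genes=[0], seed_phenos=[0], eta=0.5)
scores = random_walk_with_restart(M, p0, gamma=0.7)
gene_scores, phenotype_scores = scores[:4], scores[4:]
```

Sorting `gene_scores` and `phenotype_scores` in descending order gives the two rankings at once, which is the sense in which a single walk prioritizes genes and phenotypes simultaneously. In this sketch a node with no edges at all has an all-zero row and loses probability mass; the toy data avoids that case.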