CDGMiner: A New Tool for the Identification of Disease Genes by Text Mining and Functional Similarity Analysis

  • Authors:
  • Fang Yuan; Yanhong Zhou

  • Affiliations:
  • Hubei Bioinformatics and Molecular Imaging Key Laboratory, Huazhong University of Science and Technology, Wuhan, China 430074; Hubei Bioinformatics and Molecular Imaging Key Laboratory, Huazhong University of Science and Technology, Wuhan, China 430074

  • Venue:
  • ICIC '08 Proceedings of the 4th international conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications - with Aspects of Artificial Intelligence
  • Year:
  • 2008

Abstract

In the post-genomic era, identifying the genes involved in human disease is one of the most important tasks. Disease phenotypes provide a window into gene function. Several approaches that identify disease-related genes from function annotations have been presented in recent years. Most of them, however, start from the function annotations of genes already known to be associated with a disease, so they cannot be used for diseases that have no known pathogenic genes or related function annotations. We have built a new system, CDGMiner, to predict genes associated with diseases that lack detailed function annotations. CDGMiner works in two main phases: text mining and functional similarity analysis. Its performance was tested with a set of 1506 genes involved in 1147 disease phenotypes derived from the OMIM database. Our results show that, on average, the target gene ranked within the top 13.60% of candidates, and that it ranked within the top 5% in 40.70% of cases. CDGMiner shows promising performance compared with other existing tools.
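
The two figures reported in the abstract are rank-based summaries. The abstract does not say exactly how they were computed, so the following is only a minimal sketch under one plausible reading: each test case yields the rank of the true disease gene within a ranked candidate list, the rank is converted to a percentage of the list length, and the percentages are averaged and thresholded at 5%. The function names and the toy data are hypothetical and are not taken from CDGMiner.

```python
from typing import Sequence, Tuple


def rank_percentile(rank: int, n_candidates: int) -> float:
    """Position of the target gene as a percentage of the list length (rank 1 = best)."""
    if not 1 <= rank <= n_candidates:
        raise ValueError("rank must lie between 1 and n_candidates")
    return 100.0 * rank / n_candidates


def summarize(results: Sequence[Tuple[int, int]], cutoff: float = 5.0) -> Tuple[float, float]:
    """Return (mean rank percentile, share of cases at or under the cutoff), both in percent.

    `results` holds one (rank, n_candidates) pair per test case.
    """
    if not results:
        raise ValueError("results must not be empty")
    percentiles = [rank_percentile(rank, n) for rank, n in results]
    mean_pct = sum(percentiles) / len(percentiles)
    hit_rate = 100.0 * sum(p <= cutoff for p in percentiles) / len(percentiles)
    return mean_pct, hit_rate


# Made-up (rank, candidate list size) pairs, for illustration only.
toy_results = [(1, 100), (5, 100), (20, 100), (50, 200)]
mean_pct, hit_rate = summarize(toy_results)
print(f"mean rank percentile: {mean_pct:.2f}%")      # 12.75% for the toy data
print(f"cases within top 5%:  {hit_rate:.2f}%")      # 50.00% for the toy data
```

Under this reading, a lower mean percentile and a higher top-5% share both indicate better prioritization of the true disease gene.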