Integration of Literature with Heterogeneous Information for Genes Correlation Scoring

  • Authors:
  • Francesco Abate;Andrea Acquaviva;Elisa Ficarra;Enrico Macii

  • Affiliations:
  • Politecnico di Torino, Italy;Politecnico di Torino, Italy;Politecnico di Torino, Italy;Politecnico di Torino, Italy

  • Venue:
  • ACM Journal on Emerging Technologies in Computing Systems (JETC) - Special Issue on Bioinformatics
  • Year:
  • 2013

Abstract

Determining the correlation between biomedical terms is a powerful instrument for supporting scientists' research, both in interpreting experimental results and in designing new experiments. In particular, great potential comes from integrating the many heterogeneous information sources currently available on the Web. In this article we focus on the correlation between genes and biological processes. In this context, we present a methodology for integrating information from the biomedical literature with other heterogeneous types of structured information. Specifically, the information sources integrated in this work are PubMed abstracts, pathway databases, and NCI Thesaurus definitions. The integration is performed at the semantic analysis level using a customized approach we developed to modulate the impact of the different sources on the correlation score. We report the results of a study of how information integration affects the correlation score, and of the effect of the user-level parameters we introduced to modulate the impact of pathway data or NCI definitions relative to biomedical literature information, depending on the context of the search. To evaluate the methodology, we computed correlation measures for six biological processes and nine genes, comparing the results obtained with and without the integration of pathways and NCI definitions.