Shortest paths ranking methodology to identify alterations in PPI networks of complex diseases

  • Authors:
  • Sérgio Nery Simões;David Correa Martins-Jr;Helena Brentani;Ronaldo Fumio

  • Affiliations:
  • Universidade de São Paulo, São Paulo, SP;Universidade Federal do ABC, Santo André, SP;Universidade de São Paulo, São Paulo, SP;Universidade de São Paulo, São Paulo, SP

  • Venue:
  • Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
  • Year:
  • 2012

Abstract

Complex diseases are commonly caused by a combination of genetic alterations in several genes, which lead to abnormal propagation of signals along biological pathways. Assuming that DNA alterations can modulate differentially expressed genes through biological pathways, it is important to reveal and analyze gene interaction networks, and to identify their key players, in order to understand the mechanisms underlying complex diseases. Integrating several data sources is therefore an increasingly common approach, since it has shown promise in revealing the intricate molecular interactions present in complex diseases. In this context, many studies show the importance of methods that integrate data on genetic alterations, the transcriptome and protein-protein interactions (PPI). Our hypothesis is that, given sets of source and target genes, expression data in two conditions (control and disease) and interactome data, these data can be integrated to construct the network whose genes and interactions are most related to the corresponding complex disease. We propose a methodology that identifies potentially relevant genes and links (interactions) related to complex diseases in biological pathways by integrating association studies, gene expression data and the human interactome (PPI). The method selects, among the shortest paths between source genes and target genes, those with the highest sum of absolute correlation values among the intermediate and target genes. The intermediate genes are then ranked by their frequencies in the selected paths. This is done for both expression profiles (control and disease), generating two networks. Next, the relative ranks of the genes and links in the two networks are compared, and those with the largest alterations are prioritized to compose the resulting alteration network.
To validate our method, we adopted schizophrenia as a case study and showed that the method is promising in recovering genes known to be related to this disease.
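
The path selection and ranking steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It rests on several assumptions: the PPI network is an unweighted adjacency dictionary, correlation is Pearson correlation, the path score is the sum of absolute correlations over consecutive genes on the path excluding the source, and alteration is measured as the absolute change in rank position. All function names and the toy genes `S`, `A`, `B`, `T` are hypothetical, and only genes (not links) are ranked here.

```python
from collections import Counter, deque
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def all_shortest_paths(adj, source, target):
    """Enumerate every shortest path from source to target in an unweighted graph."""
    dist, parents, queue = {source: 0}, {source: []}, deque([source])
    while queue:  # breadth-first search that records all shortest-path parents
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                parents[v] = [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                parents[v].append(u)
    if target not in dist:
        return []
    paths = []

    def walk(node, suffix):  # backtrack from the target to the source
        if node == source:
            paths.append([source] + suffix)
            return
        for p in parents[node]:
            walk(p, [node] + suffix)

    walk(target, [])
    return paths


def path_score(path, expr):
    """Assumed score: sum of |correlation| over consecutive genes, skipping the source."""
    genes = path[1:]
    return sum(abs(pearson(expr[a], expr[b])) for a, b in zip(genes, genes[1:]))


def rank_intermediates(adj, sources, targets, expr):
    """Count how often each intermediate gene lies on a best-scoring shortest path."""
    freq = Counter()
    for s in sources:
        for t in targets:
            paths = all_shortest_paths(adj, s, t)
            if not paths:
                continue
            best = max(path_score(p, expr) for p in paths)
            for p in paths:
                if path_score(p, expr) == best:  # ties are all kept
                    freq.update(p[1:-1])
    return freq


def rank_positions(freq, genes):
    """Map each gene to its rank (1 = most frequent); ties broken by gene name."""
    ordered = sorted(genes, key=lambda g: (-freq.get(g, 0), g))
    return {g: i + 1 for i, g in enumerate(ordered)}


def rank_alterations(freq_control, freq_disease):
    """Genes sorted by absolute rank change between the two conditions (assumed measure)."""
    genes = set(freq_control) | set(freq_disease)
    rc = rank_positions(freq_control, genes)
    rd = rank_positions(freq_disease, genes)
    return sorted(genes, key=lambda g: (-abs(rc[g] - rd[g]), g))


# Toy data (made up): two alternative shortest paths S-A-T and S-B-T.
adj = {"S": ["A", "B"], "A": ["S", "T"], "B": ["S", "T"], "T": ["A", "B"]}
control = {"S": [1, 2, 3, 4], "A": [1, 2, 3, 4], "B": [4, 1, 3, 2], "T": [2, 4, 6, 8]}
disease = {"S": [1, 2, 3, 4], "A": [4, 1, 3, 2], "B": [1, 2, 3, 4], "T": [2, 4, 6, 8]}

freq_control = rank_intermediates(adj, ["S"], ["T"], control)
freq_disease = rank_intermediates(adj, ["S"], ["T"], disease)
altered = rank_alterations(freq_control, freq_disease)
print(freq_control, freq_disease, altered)
```

In this toy case the selected path goes through `A` under the control profile and through `B` under the disease profile, so both intermediate genes change rank between the two networks. A real analysis would replace the toy dictionaries with an interactome and expression matrices, and would likely need a more careful treatment of ties and of genes missing from the expression data.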