The inference of protein–protein interactions by co-evolutionary analysis is improved by excluding the information about the phylogenetic relationships

  • Authors:
  • Tetsuya Sato;Yoshihiro Yamanishi;Minoru Kanehisa;Hiroyuki Toh

  • Affiliations:
  • -; Centre de Géostatistique, Ecole des Mines de Paris, 35 rue Saint-Honoré, 77305 Fontainebleau cedex, France; Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan; Division of Bioinformatics, Medical Institute of Bioregulation, Kyushu University, Fukuoka, Fukuoka 812-8582, Japan

  • Venue:
  • Bioinformatics
  • Year:
  • 2005

Abstract

Motivation: The prediction of protein–protein interactions is currently an important issue in bioinformatics. The mirror tree method uses evolutionary information to predict protein–protein interactions. However, the mirror tree method is known to produce many false positives. The aim of our study was to solve this problem by improving the way co-evolutionary information is extracted for protein pairs.

Results: We developed a novel method to predict protein–protein interactions from co-evolutionary information in the framework of the mirror tree method. Its novelty lies in the use of a projection operator to exclude the information about the phylogenetic relationships among the source organisms from the distance matrix. For this operation, each distance matrix is transformed into a vector, referred to as a 'phylogenetic vector'. We propose three ways to extract the phylogenetic information: (1) using the 16S rRNA from the same source organisms as the proteins under consideration, (2) averaging the phylogenetic vectors and (3) analyzing the principal components of the phylogenetic vectors. We examined the performance of the proposed methods in predicting interacting protein pairs from Escherichia coli, using experimentally verified data. The method drastically reduced the number of false positives in the prediction.

Availability: The R script for the prediction of protein–protein interactions reported in this manuscript is available at http://timpani.genome.ad.jp/~proj/

Contact: sato@kuicr.kyoto-u.ac.jp

Supplementary information: The information is also available at the same site as the R script.
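The projection step described in the Results can be illustrated with a minimal Python sketch. This is not the authors' R script. The function names, the toy distance matrices, and the use of a Pearson correlation between the residual vectors as the co-evolution score are all assumptions made for illustration. The sketch flattens each distance matrix into a phylogenetic vector, removes the component along a background vector (here a 16S rRNA vector, as in approach (1)), and compares what remains.

```python
import math


def phylogenetic_vector(dist):
    """Flatten the upper triangle (i < j) of a symmetric distance matrix."""
    n = len(dist)
    return [dist[i][j] for i in range(n) for j in range(i + 1, n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def remove_component(v, background):
    """Apply the projection (I - u u^T) to v, with u = background / |background|."""
    norm = math.sqrt(dot(background, background))
    u = [x / norm for x in background]
    scale = dot(u, v)
    return [x - scale * y for x, y in zip(v, u)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    ca = [x - mean_a for x in a]
    cb = [x - mean_b for x in b]
    return dot(ca, cb) / math.sqrt(dot(ca, ca) * dot(cb, cb))


# Hypothetical toy distance matrices over the same four organisms.
rrna = [[0, 1, 2, 3], [1, 0, 2, 3], [2, 2, 0, 3], [3, 3, 3, 0]]
protein_a = [[0, 2, 3, 7], [2, 0, 5, 6], [3, 5, 0, 6], [7, 6, 6, 0]]
protein_b = [[0, 1, 3, 4], [1, 0, 2, 5], [3, 2, 0, 3], [4, 5, 3, 0]]

background = phylogenetic_vector(rrna)
residual_a = remove_component(phylogenetic_vector(protein_a), background)
residual_b = remove_component(phylogenetic_vector(protein_b), background)

# Assumed score: correlation of the residuals after the shared
# phylogenetic signal has been projected out.
score = pearson(residual_a, residual_b)
print(score)
```

Under the same assumptions, approach (2) would pass the element-wise mean of many phylogenetic vectors as `background` instead of the 16S rRNA vector.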