Motivation: The increasing availability of complete genome sequences provides an excellent opportunity for the further development of tools for functional studies in proteomics. Several experimental approaches and in silico algorithms have been developed to cluster proteins into networks of biological significance that may provide new biological insights, especially into the functions of many uncharacterized proteins. Among these methods, the phylogenetic profiles method has been widely used to predict protein-protein interactions. It involves the selection of reference organisms and the identification of homologous proteins. Up to now, no published report has systematically studied the effects of reference genome selection and homologous protein identification upon the accuracy of this method.
Results: In this study, we optimized the phylogenetic profiles method by integrating phylogenetic relationships among reference organisms and sequence homology information to improve prediction accuracy. Our results revealed that the selection of the reference organism set and the criteria for homology identification are two critical factors for the prediction accuracy of this method. Our refined phylogenetic profiles method shows better performance and potentially provides more reliable functional linkages than previous methods.
Availability: The software (C, Perl) is available from the corresponding author.
Contact: yxli@sibs.ac.cn; tlshi@sibs.ac.cn; zhaoaimin@cncbd.org.cn
Supplementary information: Three supplementary materials are available online, including related materials and results.
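To make the two factors named in the abstract concrete, the following is a minimal sketch of the generic phylogenetic profiles idea in Python. It is not the authors' refined method or their C/Perl software. The reference genome names, the E-values, the homology cutoff `threshold`, the Jaccard similarity measure and the `min_similarity` cutoff are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical reference organisms and best-hit E-values of each query
# protein against each of them (None means no hit). Illustrative data only.
REFERENCE_GENOMES = ["org_A", "org_B", "org_C", "org_D", "org_E"]
BEST_HIT_EVALUES = {
    "protein_1": [1e-40, None, 1e-25, None, 1e-30],
    "protein_2": [1e-35, None, 1e-20, None, 1e-28],
    "protein_3": [None, 1e-50, None, 1e-12, 1e-3],
}


def build_profile(evalues, threshold=1e-5):
    """Binary presence/absence vector over the reference genomes.

    A homolog is called present when the best-hit E-value is at or below
    the (assumed) homology threshold.
    """
    return tuple(1 if e is not None and e <= threshold else 0 for e in evalues)


def jaccard(p, q):
    """Jaccard similarity of two binary profiles (an assumed measure)."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def predict_linkages(evalue_table, threshold=1e-5, min_similarity=0.8):
    """Return protein pairs whose profiles are at least min_similarity alike."""
    profiles = {name: build_profile(ev, threshold)
                for name, ev in evalue_table.items()}
    links = []
    for a, b in combinations(sorted(profiles), 2):
        score = jaccard(profiles[a], profiles[b])
        if score >= min_similarity:
            links.append((a, b, score))
    return links


if __name__ == "__main__":
    for a, b, score in predict_linkages(BEST_HIT_EVALUES):
        print(f"{a} <-> {b}  similarity={score:.2f}")
```

In this toy setup, changing `threshold` changes which homologs count as present and therefore changes the profile itself, and adding or dropping columns in `REFERENCE_GENOMES` changes the vector being compared. These are the two choices the abstract identifies as critical for accuracy.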