IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
The recent advent of high-throughput methods has generated large amounts of gene interaction data, allowing the construction of genome-wide networks. A significant number of genes in such networks remain uncharacterized, and predicting their molecular function remains a major challenge. Many existing techniques assume that genes with similar functions are topologically close in the network. Our hypothesis is that genes with similar functions exhibit similar annotation patterns in their neighborhoods, regardless of the distance between them in the interaction network. We therefore predict the molecular functions of uncharacterized genes by comparing their functional neighborhoods to those of genes with known function. We propose a two-phase approach. First, we extract functional neighborhood features of a gene using Random Walks with Restarts. Second, we employ a k-nearest-neighbor (KNN) classifier to predict the function of uncharacterized genes based on the computed neighborhood features. We perform leave-one-out validation experiments on two S. cerevisiae interaction networks and show significant improvements over previous techniques. Our technique provides a natural control of the trade-off between accuracy and coverage of prediction. We further propose and evaluate prediction in sparsely annotated genomes by exploiting features from well-annotated genomes.
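
The two-phase approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`rwr_matrix`, `neighborhood_features`, `knn_predict`), the restart probability of 0.3, the Euclidean distance, and the toy six-gene path network are all assumptions chosen for illustration. A feature vector here is assumed to be the steady-state visiting probability of a random walk with restarts, summed over the annotations of the genes it visits.

```python
import numpy as np

def rwr_matrix(adj, restart=0.3, tol=1e-10, max_iter=1000):
    """Random Walks with Restarts from every node at once.

    Returns P where P[i, j] is the steady-state probability of being at
    node j for a walk that restarts at node i. The restart value is an
    illustrative assumption, not a value taken from the paper.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # isolated nodes: avoid division by zero
    W = adj / col_sums                 # column-stochastic transition matrix
    E = np.eye(n)                      # column i is the restart vector of node i
    P = E.copy()
    for _ in range(max_iter):
        P_next = (1 - restart) * (W @ P) + restart * E
        converged = np.abs(P_next - P).max() < tol
        P = P_next
        if converged:
            break
    return P.T                         # row i = distribution for start node i

def neighborhood_features(P, annotations):
    """Phase 1: visiting probability mass per function, excluding the gene itself."""
    P = P.copy()
    np.fill_diagonal(P, 0.0)           # assumption: a gene's own labels are ignored
    return P @ annotations

def knn_predict(features, annotations, query, k=1):
    """Phase 2: score each function by the fraction of the k nearest annotated genes carrying it."""
    labeled = [i for i in range(len(features))
               if i != query and annotations[i].any()]
    dists = np.array([np.linalg.norm(features[query] - features[i])
                      for i in labeled])
    nearest = [labeled[j] for j in np.argsort(dists)[:k]]
    return annotations[nearest].mean(axis=0)

# Toy network (hypothetical): a path 0-1-2-3-4-5.
adj = np.zeros((6, 6))
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
    adj[a, b] = adj[b, a] = 1.0

# Two hypothetical functions. Gene 0 is the uncharacterized query (all zeros);
# gene 5 carries function 0, the interior genes carry function 1.
annotations = np.array([[0, 0], [0, 1], [0, 1], [0, 1], [0, 1], [1, 0]])

P = rwr_matrix(adj)
features = neighborhood_features(P, annotations)
scores = knn_predict(features, annotations, query=0, k=1)
print(scores)
```

In this toy network, the gene whose neighborhood features are closest to those of gene 0 is gene 5 at the opposite end of the path, so the query receives function 0 even though its only direct neighbor carries function 1. This mirrors the hypothesis that similar annotation patterns matter more than topological proximity. Thresholding the returned scores is one possible way to trade coverage for accuracy, although the abstract does not specify the mechanism the authors use.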