Assigning biological functions to uncharacterized proteins is a fundamental problem in the postgenomic era. The increasing availability of large amounts of data on protein-protein interactions (PPIs) has led to the emergence of a considerable number of computational methods for determining protein function in the context of a network. These algorithms, however, treat each functional class in isolation and therefore often suffer from the scarcity of labeled data. In reality, different functional classes are naturally dependent on one another. We propose a new algorithm, Multi-label Correlated Semi-supervised Learning (MCSL), which incorporates the intrinsic correlations among functional classes into protein function prediction by leveraging the relationships provided by the PPI network and the functional class network. The guiding intuition is that the classification function should be sufficiently smooth on subgraphs where the respective topologies of these two networks are a good match. We encode this intuition as regularized learning with intraclass and interclass consistency, which can be understood as an extension of the graph-based learning with local and global consistency (LGC) method. Cross-validation on the yeast proteome shows that MCSL consistently outperforms several state-of-the-art methods. Most notably, it effectively overcomes the problem associated with the scarcity of labeled data. The supplementary files are freely available at http://sites.google.com/site/csaijiang/MCSL.
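To make the LGC baseline mentioned above concrete, the following is a minimal Python sketch, not the MCSL implementation. The function `lgc` follows the standard closed form of local and global consistency, F = (I - αS)^{-1} Y with S = D^{-1/2} W D^{-1/2}. The function `two_network_propagation` is only an illustrative assumption about how smoothing over a PPI network and a functional class network could be combined; its update rule, the class-similarity matrix `C`, and all function and parameter names are hypothetical and are not taken from the paper.

```python
import numpy as np


def sym_normalize(W):
    """Return D^{-1/2} W D^{-1/2}; nodes with zero degree get all-zero rows."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    nonzero = d > 0
    d_inv_sqrt[nonzero] = 1.0 / np.sqrt(d[nonzero])
    return W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def lgc(W, Y, alpha=0.9):
    """Standard LGC closed form: solve (I - alpha * S) F = Y.

    W: (n, n) symmetric protein-protein affinity matrix.
    Y: (n, k) label matrix, Y[i, j] = 1 if protein i is annotated with class j.
    """
    S = sym_normalize(W)
    n = S.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, np.asarray(Y, dtype=float))


def two_network_propagation(W, C, Y, alpha=0.9, max_iter=1000, tol=1e-10):
    """Hypothetical two-sided smoothing (an assumption, not the paper's rule).

    Iterates F <- alpha * S_p F S_c + (1 - alpha) * Y, where S_p is the
    normalized PPI matrix and S_c the normalized class-similarity matrix.
    """
    S_p = sym_normalize(W)
    S_c = sym_normalize(C)
    Y = np.asarray(Y, dtype=float)
    F = Y.copy()
    for _ in range(max_iter):
        F_next = alpha * S_p @ F @ S_c + (1.0 - alpha) * Y
        if np.max(np.abs(F_next - F)) < tol:
            return F_next
        F = F_next
    return F


# Toy example: a 4-protein path graph 0-1-2-3 with two functional classes.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
# Only proteins 0 and 3 are labeled (class 0 and class 1 respectively).
Y = np.array([[1, 0],
              [0, 0],
              [0, 0],
              [0, 1]], dtype=float)

F = lgc(W, Y, alpha=0.9)
predicted = F.argmax(axis=1)  # each unlabeled protein takes its nearest label
```

In this toy graph the two unlabeled proteins inherit the class of the labeled neighbor they are closest to. When `C` is the identity matrix, the hypothetical two-network iteration reduces to LGC scaled by (1 - α), so any difference between the two comes entirely from off-diagonal class correlations.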