Predicting protein function using Gene Ontology terms is a hierarchical classification problem. A variety of genomic data are relevant to a protein's function: its sequence, its interactions with other proteins, the expression of its gene, and so on. Some of these sources (interactions and expression) are species-specific, whereas protein sequence is comparable across species; this difference complicates the task of integrating labeled data from a target species with labeled data from other species. We address this problem using the methodology of structured output learning, present a framework based on multi-view learning that is naturally suited to combining both types of data, and demonstrate its effectiveness in making predictions for proteins in S. cerevisiae and M. musculus. The code for our framework is available at http://strut.sourceforge.net.
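To make the two ideas in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation or the API of the released code. It assumes a toy four-term hierarchy with made-up term names, a simple block-structured joint feature map, and a linear score that sums a cross-species view (sequence features) and a species-specific view (for example, interaction or expression features). All function names, weights, and feature values are hypothetical. A label is a set of terms closed under the ancestor relation, and a protein from another species is scored with the cross-species view only.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional, Sequence

# Toy hierarchy (hypothetical term names): each term maps to its parents.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
}
TERMS: List[str] = sorted(PARENTS)  # fixed term order for the feature map


def ancestor_closure(terms: Iterable[str]) -> FrozenSet[str]:
    """Return the given terms plus all of their ancestors.

    A hierarchical label is only valid if it contains every ancestor
    of each term it contains.
    """
    closed = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(PARENTS[term])
    return frozenset(closed)


def joint_features(x: Sequence[float], label: FrozenSet[str]) -> List[float]:
    """Block-structured joint feature map (an assumed, simple choice):
    one copy of x for each term in the label, zeros for absent terms."""
    out: List[float] = []
    for term in TERMS:
        out.extend(x if term in label else [0.0] * len(x))
    return out


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(p * q for p, q in zip(a, b))


def score(w_cross: Sequence[float], w_species: Sequence[float],
          x_seq: Sequence[float], x_species: Optional[Sequence[float]],
          label: FrozenSet[str]) -> float:
    """Sum of per-view compatibility scores.

    x_species is None for proteins from other species, which only have
    the cross-species (sequence) view.
    """
    total = dot(w_cross, joint_features(x_seq, label))
    if x_species is not None:
        total += dot(w_species, joint_features(x_species, label))
    return total


def predict(w_cross: Sequence[float], w_species: Sequence[float],
            x_seq: Sequence[float], x_species: Optional[Sequence[float]],
            candidates: Sequence[FrozenSet[str]]) -> FrozenSet[str]:
    """Return the highest-scoring candidate label (exhaustive argmax)."""
    return max(candidates,
               key=lambda y: score(w_cross, w_species, x_seq, x_species, y))
```

In this sketch, inference is an exhaustive argmax over a small candidate set, which is only practical for toy hierarchies; how the weights are learned from labeled data is outside its scope.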