Motivation: Building an accurate protein classification system depends critically upon choosing a good representation of the input sequences of amino acids. Recent work using string kernels for protein data has achieved state-of-the-art classification performance. However, such representations are based only on labeled data (examples with known 3D structures, organized into structural classes), whereas in practice, unlabeled data are far more plentiful.

Results: In this work, we develop simple and scalable cluster kernel techniques for incorporating unlabeled data into the representation of protein sequences. We show that our methods greatly improve the classification performance of string kernels and outperform standard approaches for using unlabeled data, such as adding close homologs of the positive examples to the training data. We achieve equal or superior performance to previously presented cluster kernel methods while achieving far greater computational efficiency.

Availability: Source code is available at www.kyb.tuebingen.mpg.de/bs/people/weston/semiprot. The Spider MATLAB package is available at www.kyb.tuebingen.mpg.de/bs/people/spider

Contact: jasonw@nec-labs.com

Supplementary information: www.kyb.tuebingen.mpg.de/bs/people/weston/semiprot
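
To make the idea of a cluster kernel concrete, the following is a minimal Python sketch of one way unlabeled sequences could be folded into a string-kernel representation: the kernel between two sequences is replaced by the average base-kernel value over their neighborhoods in the unlabeled data. It is an illustration under stated assumptions, not the released code. The function names, the 3-mer spectrum base kernel, and the similarity threshold used to define a neighborhood are all hypothetical choices made for this example.

```python
from collections import Counter
import math


def spectrum_features(seq, k=3):
    """Count every length-k substring (k-mer) of a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Base string kernel: normalized dot product of k-mer count vectors."""
    fa, fb = spectrum_features(a, k), spectrum_features(b, k)
    dot = sum(count * fb[kmer] for kmer, count in fa.items())
    norm = math.sqrt(sum(c * c for c in fa.values()) *
                     sum(c * c for c in fb.values()))
    return dot / norm if norm else 0.0


def neighborhood(seq, unlabeled, threshold=0.5, k=3):
    """Assumed neighborhood: the sequence itself plus every unlabeled
    sequence whose base-kernel similarity reaches the threshold."""
    near = [u for u in unlabeled
            if u != seq and spectrum_kernel(seq, u, k) >= threshold]
    return [seq] + near


def neighborhood_kernel(a, b, unlabeled, threshold=0.5, k=3):
    """Average the base kernel over all pairs drawn from the two neighborhoods."""
    na = neighborhood(a, unlabeled, threshold, k)
    nb = neighborhood(b, unlabeled, threshold, k)
    total = sum(spectrum_kernel(x, y, k) for x in na for y in nb)
    return total / (len(na) * len(nb))


if __name__ == "__main__":
    a, b = "ACDEFG", "FGHIKL"      # toy sequences sharing no 3-mers
    unlabeled = ["ACDEFGHIKL"]     # toy unlabeled sequence overlapping both
    print(spectrum_kernel(a, b))
    print(neighborhood_kernel(a, b, unlabeled))
```

In this toy example the two labeled sequences share no 3-mers, so the base kernel sees them as unrelated, while the unlabeled sequence that overlaps both gives the neighborhood-averaged kernel a positive value. With no unlabeled data, the averaged kernel reduces to the base kernel.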