We present a framework for discriminative sequence classification in which linear classifiers work directly in the explicit high-dimensional predictor space of all subsequences in the training set, rather than in a kernel-induced space. This is made feasible by a gradient-bounded coordinate-descent algorithm that efficiently selects discriminative subsequences without expanding the whole space. The framework applies to a wide range of loss functions, including the binomial log-likelihood loss of logistic regression and the squared hinge loss of support vector machines. Applied to protein remote homology detection and remote fold recognition, it achieves performance comparable to the state of the art (e.g., kernel support vector machines). In contrast to state-of-the-art sequence classifiers, our models are simply lists of weighted discriminative subsequences and can therefore be interpreted and related to the biological problem, a crucial requirement for the bioinformatics and medical communities.
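To make the idea of selecting a subsequence "without expanding the whole space" concrete, the following is a minimal sketch and not the authors' implementation. It assumes contiguous substrings, binary presence features, labels in {-1, +1}, and the logistic loss; the function names, the toy data, and the `max_len` cap are all hypothetical. The pruning relies on one observation: any extension of a substring occurs in a subset of the sequences containing that substring, so the gradient magnitude of every extension is bounded by the larger of the positive and negative per-example gradient mass in that support set.

```python
import math

def per_example_gradients(seqs, labels, weights):
    """Derivative of log(1 + exp(-y * f)) with respect to the score f, per sequence."""
    grads = []
    for seq, y in zip(seqs, labels):
        # Score under the current model: sum of weights of substrings present.
        f = sum(w for s, w in weights.items() if s in seq)
        grads.append(-y / (1.0 + math.exp(y * f)))
    return grads

def best_coordinate(seqs, labels, weights, max_len=5):
    """Return ((substring, gradient), nodes_visited) for the substring
    whose coordinate has the largest absolute gradient."""
    g = per_example_gradients(seqs, labels, weights)
    best = ("", 0.0)
    visited = 0

    def grow(s, support):
        nonlocal best, visited
        visited += 1
        grad = sum(g[i] for i in support)
        if abs(grad) > abs(best[1]):
            best = (s, grad)
        # Upper bound on |gradient| of any extension of s.
        pos = sum(g[i] for i in support if g[i] > 0)
        neg = -sum(g[i] for i in support if g[i] < 0)
        if max(pos, neg) <= abs(best[1]) or len(s) >= max_len:
            return  # no extension can beat the current best: prune
        # Collect one-character right extensions and their support sets.
        extensions = {}
        for i in support:
            seq = seqs[i]
            start = seq.find(s)
            while start != -1:
                end = start + len(s)
                if end < len(seq):
                    extensions.setdefault(seq[end], set()).add(i)
                start = seq.find(s, start + 1)
        for ch, sup in sorted(extensions.items()):
            grow(s + ch, sup)

    for ch in sorted(set("".join(seqs))):
        grow(ch, {i for i, seq in enumerate(seqs) if ch in seq})
    return best, visited

# Hypothetical toy data: three "positive" and three "negative" sequences.
toy_seqs = ["AAKLV", "GAKLA", "KLVAA", "GGSTA", "STAGG", "ASTGA"]
toy_labels = [1, 1, 1, -1, -1, -1]

if __name__ == "__main__":
    (substring, gradient), visited = best_coordinate(toy_seqs, toy_labels, {})
    print(substring, gradient, visited)
```

In a full coordinate-descent loop, the selected substring's weight would be updated (for example by a line search on the chosen loss) and the search repeated; the final model would then be the list of substrings with nonzero weight, which is what makes it directly inspectable.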