Machine Learning
Making large-scale support vector machine learning practical
Advances in kernel methods
Fast training of support vector machines using sequential minimal optimization
Advances in kernel methods
An introduction to support Vector Machines: and other kernel-based learning methods
An introduction to support Vector Machines: and other kernel-based learning methods
Proceedings of the sixth annual international conference on Computational biology
A new discriminative kernel from probabilistic models
Neural Computation
Using the Fisher Kernel Method to Detect Remote Protein Homologies
Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology
An introduction to kernel-based learning algorithms
IEEE Transactions on Neural Networks
3DString: a feature string kernel for 3D object classification on voxelized data
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Large Scale Multiple Kernel Learning
The Journal of Machine Learning Research
On Relevant Dimensions in Kernel Feature Spaces
The Journal of Machine Learning Research
A brief survey on sequence classification
ACM SIGKDD Explorations Newsletter
Detecting adversarial advertisements in the wild
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Efficient evaluation of large sequence kernels
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
2D similarity kernels for biological sequence classification
Proceedings of the 11th International Workshop on Data Mining in Bioinformatics
Biological Sequence Classification with Multivariate String Kernels
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Hi-index | 0.00 |
In genomic sequence analysis tasks like splice site recognition or promoter identification, large amounts of training sequences are available, and indeed needed to achieve sufficiently high classification performances. In this work we study two recently proposed and successfully used kernels, namely the Spectrum kernel and the Weighted Degree kernel (WD). In particular, we suggest several extensions using Suffix Trees and modifications of an SMO-like SVM training algorithm in order to accelerate the training of the SVMs and their evaluation on test sequences. Our simulations show that for the spectrum kernel and WD kernel, large scale SVM training can be accelerated by factors of 20 and 4 times, respectively, while using much less memory (e.g. no kernel caching). The evaluation on new sequences is often several thousand times faster using the new techniques (depending on the number of Support Vectors). Our method allows us to train on sets as large as one million sequences.