This paper presents a novel application of automata algorithms to machine learning. It introduces the first optimization solution for support vector machines used with sequence kernels that is purely based on weighted automata and transducer algorithms, without requiring any specific solver. The algorithms presented apply to a family of kernels covering all those commonly used in text and speech processing or computational biology. We show that these algorithms have significantly better computational complexity than previous ones and report the results of large-scale experiments demonstrating a dramatic reduction of the training time, typically by several orders of magnitude.
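To make the notion of a sequence kernel concrete, the sketch below computes a common member of the family the abstract refers to: an n-gram kernel, which counts the shared n-grams of two strings. This is a minimal illustrative stand-in using plain counting; the paper's contribution is to compute such kernels with weighted automata and transducer algorithms, which this sketch does not reproduce. The function name `ngram_kernel` is hypothetical.

```python
from collections import Counter

def ngram_kernel(x: str, y: str, n: int = 3) -> int:
    """K(x, y) = sum over all n-grams s of count_x(s) * count_y(s).

    Illustrative counting implementation of an n-gram sequence kernel
    (one of the rational kernels commonly used in text and speech
    processing); NOT the automata-based algorithm from the paper.
    """
    # Count every contiguous substring of length n in each input.
    cx = Counter(x[i:i + n] for i in range(len(x) - n + 1))
    cy = Counter(y[i:i + n] for i in range(len(y) - n + 1))
    # Inner product of the two count vectors.
    return sum(c * cy[s] for s, c in cx.items())
```

For example, with `n=2`, `ngram_kernel("abcab", "abc", 2)` yields 3: the bigram "ab" occurs twice in the first string and once in the second, and "bc" occurs once in each.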