A training algorithm for optimal margin classifiers. COLT '92: Proceedings of the Fifth Annual Workshop on Computational Learning Theory.
Making large-scale support vector machine learning practical. Advances in Kernel Methods.
Fast training of support vector machines using sequential minimal optimization. Advances in Kernel Methods.
Linear programming in linear time when the dimension is fixed. Journal of the ACM (JACM).
A simple decomposition method for support vector machines. Machine Learning.
Convergence of a generalized SMO algorithm for SVM classifier design. Machine Learning.
Polynomial-time decomposition algorithms for support vector machines. Machine Learning.
A note on the decomposition methods for support vector regression. Neural Computation.
Training support vector machines: an application to face detection. CVPR '97: Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition.
Lagrangian support vector machines. The Journal of Machine Learning Research.
Convex Optimization.
Improvements to Platt's SMO algorithm for SVM classifier design. Neural Computation.
QP algorithms with guaranteed accuracy and run time for support vector machines. The Journal of Machine Learning Research.
Training support vector machines via SMO-type decomposition methods. ALT '05: Proceedings of the 16th International Conference on Algorithmic Learning Theory.
Successive overrelaxation for support vector machines. IEEE Transactions on Neural Networks.
The analysis of decomposition methods for support vector machines. IEEE Transactions on Neural Networks.
Improvements to the SMO algorithm for SVM regression. IEEE Transactions on Neural Networks.
On the convergence of the decomposition method for support vector machines. IEEE Transactions on Neural Networks.
Asymptotic convergence of an SMO algorithm without any assumptions. IEEE Transactions on Neural Networks.
A formal analysis of stopping criteria of decomposition methods for support vector machines. IEEE Transactions on Neural Networks.
A study on SMO-type decomposition methods for support vector machines. IEEE Transactions on Neural Networks.
Generalized SMO-style decomposition algorithms. COLT '07: Proceedings of the 20th Annual Conference on Learning Theory.
Radial kernels and their reproducing kernel Hilbert spaces. Journal of Complexity.
LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST).
We present a general decomposition algorithm that is uniformly applicable to every (suitably normalized) instance of Convex Quadratic Optimization and efficiently approaches an optimal solution. The number of iterations required to come within ε of optimality grows linearly with 1/ε and quadratically with the number m of variables. Working set selection can be performed in polynomial time. If we restrict attention to instances of Convex Quadratic Optimization with at most k0 equality constraints, for some fixed constant k0, plus so-called box constraints (conditions that hold for most variants of SVM-optimization), the working set can be found in linear time. Our analysis builds on a generalization of the concept of rate certifying pairs introduced by Hush and Scovel (2003). In order to extend their results to arbitrary instances of Convex Quadratic Optimization, we introduce the general notion of a rate certifying q-set. We improve on the results of Hush and Scovel (2003) in several ways. First, our result holds for Convex Quadratic Optimization in general, whereas their results are specialized to SVM-optimization. Second, we achieve a higher rate of convergence even in the special case of SVM-optimization (despite the generality of our approach). Third, our analysis is technically simpler. We furthermore prove that the strategy for working set selection based on rate certifying sets coincides with a strategy based on a so-called "sparse witness of sub-optimality". Viewed from this perspective, our main result improves on the convergence results of List and Simon (2004) and Simon (2004) by providing convergence rates (and by holding under more general conditions).
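To make the decomposition scheme concrete, the following Python sketch shows a generic SMO-style decomposition loop for the most familiar special case covered by the abstract: the SVM dual, i.e. one equality constraint plus box constraints. It uses the classical maximal-violating-pair rule for working set selection, a simple first-order stand-in for the rate certifying q-sets analyzed in the paper; the function name, parameters, and tolerances below are illustrative assumptions, not taken from the paper.

    import numpy as np

    def smo_decomposition(Q, y, C, tol=1e-3, max_iter=100000):
        # Minimize 0.5*a@Q@a - sum(a)  subject to  y@a == 0  and  0 <= a <= C,
        # i.e. the standard SVM dual: one equality constraint plus box constraints.
        m = y.shape[0]
        a = np.zeros(m)
        grad = -np.ones(m)                    # gradient of the objective at a = 0
        for _ in range(max_iter):
            # Index sets in which a step keeps the box constraints feasible.
            i_up = ((y > 0) & (a < C)) | ((y < 0) & (a > 0))
            i_low = ((y > 0) & (a > 0)) | ((y < 0) & (a < C))
            viol = -y * grad                  # first-order violation scores
            i = int(np.where(i_up, viol, -np.inf).argmax())
            j = int(np.where(i_low, viol, np.inf).argmin())
            if viol[i] - viol[j] <= tol:      # approximate KKT conditions hold
                break
            # Closed-form step for the two-variable sub-problem on {i, j}.
            eta = Q[i, i] + Q[j, j] - 2.0 * y[i] * y[j] * Q[i, j]
            t = (viol[i] - viol[j]) / max(eta, 1e-12)
            # Clip t so that a[i] + y[i]*t and a[j] - y[j]*t stay in [0, C].
            lo_i, hi_i = (-a[i], C - a[i]) if y[i] > 0 else (a[i] - C, a[i])
            lo_j, hi_j = (a[j] - C, a[j]) if y[j] > 0 else (-a[j], C - a[j])
            t = float(np.clip(t, max(lo_i, lo_j), min(hi_i, hi_j)))
            a[i] += y[i] * t
            a[j] -= y[j] * t
            grad += t * (y[i] * Q[:, i] - y[j] * Q[:, j])
        return a

Given a kernel matrix K and labels y in {-1, +1}, one would call smo_decomposition(K * np.outer(y, y), y.astype(float), C=1.0). In the general Convex Quadratic Optimization setting of the paper, the pair selection above would be replaced by selection of a rate certifying q-set, computable in polynomial time (and in linear time under the restriction to at most k0 equality constraints plus box constraints).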