Training a support vector machine (SVM) requires solving a linearly constrained convex quadratic programming problem. In real applications, the number of training examples may be very large, so that the Hessian matrix cannot be stored. A common strategy for dealing with this issue is to use decomposition algorithms, which at each iteration operate only on a small subset of variables, usually referred to as the working set. Training time can be significantly reduced by a caching technique that allocates some memory to store the columns of the Hessian matrix corresponding to recently updated variables. The convergence properties of a decomposition method can be guaranteed by a suitable selection of the working set, but this requirement can limit the possibility of exploiting the information stored in the cache. We propose a general hybrid algorithm model that combines the capability of producing a globally convergent sequence of points with a flexible use of the information in the cache. As an example of a specific realization of the general hybrid model, we describe an algorithm based on a particular strategy for exploiting the information provided by the caching technique. We report the results of computational experiments performed with simple implementations of this algorithm. The numerical results indicate the potential of the approach.
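
To make the two ingredients discussed in the abstract concrete, the following Python sketch combines a decomposition method that uses a two-variable working set with a least-recently-used cache of Hessian columns. It is an illustration under stated assumptions, not the hybrid algorithm proposed in the paper: the working set is chosen by the standard most-violating-pair rule, which ignores the cache contents, and the Gaussian kernel, the names `ColumnCache` and `train_smo`, and all parameter values are hypothetical choices made only for this example.

```python
import math
from collections import OrderedDict


class ColumnCache:
    """LRU cache of Hessian columns, Q[t][i] = y_t * y_i * K(x_t, x_i)."""

    def __init__(self, X, y, gamma, capacity):
        self.X, self.y, self.gamma, self.capacity = X, y, gamma, capacity
        self.columns = OrderedDict()
        self.hits = self.misses = 0

    def _kernel(self, a, b):
        # Gaussian kernel, assumed here only for illustration.
        return math.exp(-self.gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

    def column(self, i):
        if i in self.columns:
            self.hits += 1
            self.columns.move_to_end(i)  # mark as most recently used
            return self.columns[i]
        self.misses += 1
        col = [self.y[t] * self.y[i] * self._kernel(self.X[t], self.X[i])
               for t in range(len(self.X))]
        self.columns[i] = col
        if len(self.columns) > self.capacity:
            self.columns.popitem(last=False)  # evict least recently used
        return col


def train_smo(X, y, C=1.0, gamma=1.0, eps=1e-3, cache_size=4, max_iter=10000):
    """Minimize 0.5*a'Qa - e'a subject to y'a = 0 and 0 <= a <= C."""
    n = len(X)
    alpha = [0.0] * n
    grad = [-1.0] * n  # gradient Q a - e at the starting point a = 0
    cache = ColumnCache(X, y, gamma, cache_size)
    gap = float("inf")
    for _ in range(max_iter):
        # Indices whose variable can still move in the +y / -y direction.
        up = [t for t in range(n)
              if (y[t] > 0 and alpha[t] < C) or (y[t] < 0 and alpha[t] > 0)]
        low = [t for t in range(n)
               if (y[t] > 0 and alpha[t] > 0) or (y[t] < 0 and alpha[t] < C)]
        i = max(up, key=lambda t: -y[t] * grad[t])
        j = min(low, key=lambda t: -y[t] * grad[t])
        gap = -y[i] * grad[i] + y[j] * grad[j]  # KKT violation of the pair
        if gap < eps:
            break
        qi, qj = cache.column(i), cache.column(j)
        # Curvature along d = y_i e_i - y_j e_j, guarded against zero.
        curv = max(qi[i] + qj[j] - 2.0 * y[i] * y[j] * qi[j], 1e-12)
        # Largest steps that keep alpha_i and alpha_j inside [0, C].
        room_i = C - alpha[i] if y[i] > 0 else alpha[i]
        room_j = alpha[j] if y[j] > 0 else C - alpha[j]
        step = min(gap / curv, room_i, room_j)
        alpha[i] = min(C, max(0.0, alpha[i] + step * y[i]))
        alpha[j] = min(C, max(0.0, alpha[j] - step * y[j]))
        for t in range(n):
            grad[t] += step * (y[i] * qi[t] - y[j] * qj[t])
    return alpha, cache, gap


# Toy XOR-like data set (made up for this example).
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0),
     (0.1, 0.1), (0.9, 0.9), (0.1, 0.9), (0.9, 0.1)]
y = [1, 1, -1, -1, 1, 1, -1, -1]
alpha, cache, gap = train_smo(X, y)
print("cache hits:", cache.hits, "misses:", cache.misses, "final gap:", gap)
```

Each iteration needs only the two columns indexed by the working set, so a cache hit saves a full pass of kernel evaluations over the training set. Because the selection rule in this sketch looks only at the gradient, it may repeatedly request columns that have just been evicted, which is the tension between convergence-driven selection and cache reuse that the abstract describes.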