Multiple Kernel Learning (MKL) aims to learn the kernel in an SVM from training data. Many MKL formulations have been proposed, and some have proved effective in certain applications. Nevertheless, as MKL is a nascent field, many more formulations need to be developed to generalize across domains and meet the challenges of real-world applications. However, each MKL formulation typically necessitates the development of a specialized optimization algorithm. The lack of an efficient, general-purpose optimizer capable of handling a wide range of formulations presents a significant challenge to those looking to take MKL out of the lab and into the real world. This problem was somewhat alleviated by the development of the Generalized Multiple Kernel Learning (GMKL) formulation, which admits fairly general kernel parameterizations and regularizers subject to mild constraints. However, the projected gradient descent optimizer used for GMKL is inefficient, as computing the step size, a reasonably accurate objective value, and the gradient direction are all expensive. We overcome these limitations by developing a Spectral Projected Gradient (SPG) descent optimizer which: (a) takes second-order information into account when selecting step sizes; (b) employs a non-monotone step size selection criterion requiring fewer function evaluations; (c) is robust to gradient noise; and (d) can take quick steps when far from the optimum. We show that our proposed SPG-GMKL optimizer can be an order of magnitude faster than projected gradient descent, even on small and medium-sized datasets. In some cases, SPG-GMKL can even outperform state-of-the-art specialized optimizers developed for a single MKL formulation. Furthermore, we demonstrate that SPG-GMKL scales well beyond gradient descent to large problems involving a million kernels or half a million data points. Our code and implementation are publicly available.
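
The abstract only names the ingredients of the SPG optimizer, so as a rough illustration (not the authors' implementation), the sketch below shows a generic spectral projected gradient loop: a Barzilai-Borwein step length supplies cheap curvature (second-order) information, and a Grippo-style non-monotone Armijo test accepts a step as long as it improves on the worst of the last M objective values, which typically cuts the number of function evaluations. All names here (spg_minimize, project, the toy non-negative least-squares problem) are hypothetical and chosen for the example.

import numpy as np

def spg_minimize(f, grad, project, x0, max_iter=200, M=10,
                 alpha_min=1e-10, alpha_max=1e10, tol=1e-6):
    # Generic SPG loop: Barzilai-Borwein spectral step lengths plus a
    # non-monotone Armijo backtracking line search (illustrative sketch).
    x = project(np.asarray(x0, dtype=float))
    fx, gx = f(x), grad(x)
    history = [fx]        # recent objective values for the non-monotone test
    alpha = 1.0           # initial spectral step length
    for _ in range(max_iter):
        d = project(x - alpha * gx) - x       # projected gradient direction
        if np.linalg.norm(d, np.inf) < tol:   # stationary on the feasible set
            break
        f_ref = max(history[-M:])             # worst of the last M values
        lam, slope = 1.0, float(gx @ d)       # slope < 0 for a descent direction
        # Non-monotone Armijo condition: compare against f_ref, not f(x),
        # so occasional increases are tolerated and fewer evaluations needed.
        while f(x + lam * d) > f_ref + 1e-4 * lam * slope and lam > 1e-12:
            lam *= 0.5
        x_new = x + lam * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - gx
        sy = float(s @ y)
        # Barzilai-Borwein step: a scalar approximation of inverse curvature,
        # safeguarded to [alpha_min, alpha_max]; fall back if curvature <= 0.
        alpha = np.clip(float(s @ s) / sy, alpha_min, alpha_max) if sy > 0 else alpha_max
        x, gx = x_new, g_new
        fx = f(x)
        history.append(fx)
    return x, fx

# Toy usage (hypothetical): non-negative least squares, i.e. minimize
# 0.5 * ||A x - b||^2 over the non-negative orthant.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(50, 20)), rng.normal(size=50)
f = lambda x: 0.5 * float(np.sum((A @ x - b) ** 2))
grad = lambda x: A.T @ (A @ x - b)
project = lambda x: np.maximum(x, 0.0)   # projection onto the feasible set
x_opt, f_opt = spg_minimize(f, grad, project, np.ones(20))

In an MKL setting the variables would be the kernel combination weights, the projection would enforce the formulation's constraints on those weights, and each f/grad call would involve an SVM solve, which is exactly why reducing function evaluations and tolerating noisy gradients matter.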