Interior-Point Methods for Massive Support Vector Machines
SIAM Journal on Optimization
Convex Optimization
Smooth minimization of non-smooth functions
Mathematical Programming: Series A and B
A support vector method for multivariate performance measures
Proceedings of the 22nd International Conference on Machine Learning (ICML '05)
Training linear SVMs in linear time
Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Training a Support Vector Machine in the Primal
Neural Computation
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
Proceedings of the 24th International Conference on Machine Learning
Optimized cutting plane algorithm for support vector machines
Proceedings of the 25th International Conference on Machine Learning
A dual coordinate descent method for large-scale linear SVM
Proceedings of the 25th International Conference on Machine Learning
Proximal regularization for online and batch learning
Proceedings of the 26th Annual International Conference on Machine Learning (ICML '09)
Cutting-plane training of structural SVMs
Machine Learning
A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
SIAM Journal on Imaging Sciences
Bundle Methods for Regularized Risk Minimization
The Journal of Machine Learning Research
NESVM: A Fast Gradient Method for Support Vector Machines
Proceedings of the 2010 IEEE International Conference on Data Mining (ICDM '10)
Mirror descent and nonlinear projected subgradient methods for convex optimization
Operations Research Letters
Optimizing multivariate performance measures is an important task in machine learning. Joachims (2005) introduced a Support Vector Method for this problem whose underlying optimization problem is commonly solved by cutting plane methods (CPMs) such as SVM-Perf and BMRM. It can be shown that CPMs converge to an ε-accurate solution in O(1/(λε)) iterations, where λ is the trade-off parameter between the regularizer and the loss function. Motivated by the impressive practical convergence of CPMs on a number of problems, it was conjectured that these rates could be further improved. We disprove this conjecture by constructing counterexamples. Surprisingly, however, these problems are not inherently hard: we develop a novel smoothing strategy which, in conjunction with Nesterov's accelerated gradient method, finds an ε-accurate solution in O*(min{1/ε, 1/√(λε)}) iterations. Computationally, our smoothing technique is also particularly advantageous for optimizing multivariate performance scores such as the precision/recall break-even point and ROCArea, since the cost per iteration remains the same as that of CPMs. Empirical evaluation on some of the largest publicly available data sets shows that our method converges significantly faster than CPMs without sacrificing generalization ability.
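To make the smoothing-plus-acceleration idea concrete, the following is a minimal sketch for the binary-SVM special case: the hinge loss is replaced by its Nesterov-smoothed (Huber-type) surrogate and the resulting objective is minimized with an accelerated (FISTA-style) gradient loop. This is an assumption-laden illustration, not the paper's method; the actual smoothing targets multivariate measures such as PRBEP and ROCArea, and the names smoothed_hinge_grad, mu, and lam are invented for this sketch.

```python
# Sketch only: Nesterov-smoothed hinge + accelerated gradient for a linear SVM.
# Not the authors' implementation; mu (smoothing) and lam (regularization) are
# illustrative parameters.
import numpy as np

def smoothed_hinge_grad(w, X, y, mu):
    """Average Nesterov-smoothed hinge loss and its gradient.

    Writing max(0, 1 - margin) = max_{a in [0,1]} a*(1 - margin) and subtracting
    (mu/2)*a^2 gives a smooth surrogate whose maximizer has the closed form
    a* = clip((1 - margin)/mu, 0, 1).
    """
    margin = y * (X @ w)
    a = np.clip((1.0 - margin) / mu, 0.0, 1.0)            # optimal smoothed dual variables
    loss = np.mean(a * (1.0 - margin) - 0.5 * mu * a ** 2)
    grad = -(X.T @ (a * y)) / X.shape[0]
    return loss, grad

def accelerated_svm(X, y, lam=1e-4, mu=1e-2, max_iter=200):
    """Minimize lam/2 * ||w||^2 + smoothed average hinge with Nesterov acceleration."""
    n, d = X.shape
    # Gradient Lipschitz constant: lam + sigma_max(X)^2 / (n * mu)
    L = lam + np.linalg.norm(X, 2) ** 2 / (n * mu)
    w = z = np.zeros(d)
    t = 1.0
    for _ in range(max_iter):
        _, g = smoothed_hinge_grad(z, X, y, mu)
        w_next = z - (lam * z + g) / L                     # gradient step at the lookahead point
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t ** 2))
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)   # momentum extrapolation
        w, t = w_next, t_next
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 20))
    y = np.sign(X @ rng.standard_normal(20))
    w = accelerated_svm(X, y)
    print("training accuracy:", np.mean(np.sign(X @ w) == y))
```

In this sketch the smoothing parameter mu trades approximation accuracy against the gradient's Lipschitz constant, which is what yields the O*(min{1/ε, 1/√(λε)}) style rates quoted in the abstract when mu is chosen on the order of ε.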