Training structural SVMs with kernels using sampled cuts

  • Authors:
  • Chun-Nam John Yu; Thorsten Joachims

  • Affiliations:
  • Cornell University, Ithaca, NY, USA; Cornell University, Ithaca, NY, USA

  • Venue:
  • Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining

  • Year:
  • 2008

Abstract

Discriminative training for structured outputs has found increasing application in areas such as natural language processing, bioinformatics, information retrieval, and computer vision. Focusing on large-margin methods, the most general (in terms of loss function and model structure) training algorithms known to date are based on cutting-plane approaches. While these algorithms are very efficient for linear models, their training complexity becomes quadratic in the number of examples when kernels are used. To overcome this bottleneck, we propose new training algorithms that use approximate cutting planes and random sampling to enable efficient training with kernels. We prove that these algorithms have improved time complexity while providing approximation guarantees. In empirical evaluations, our algorithms produced solutions with training and test error rates close to those of exact solvers. Even on binary classification problems, for which highly optimized conventional solvers exist (e.g., SVM-light), our methods are about an order of magnitude faster on large datasets, while remaining competitive in speed on datasets of medium size.
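To make the idea of sampled cuts concrete, the sketch below is a minimal, hypothetical rendering for the special case the abstract mentions last: binary classification with an RBF kernel. It is not the authors' implementation; all names (train_sampled_cuts, solve_cut_qp, the sample size r, etc.) are illustrative. Each cutting plane is built from the margin violators found inside a random sample of r examples rather than the full training set, so the kernel work per cut scales with r instead of n, and the working-set QP stays small because it has one variable per cut.

```python
# Minimal sketch (not the paper's code) of cutting-plane SVM training
# with sampled cuts: binary classification, RBF kernel. Assumes n >= r.
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(A, B, gamma=0.5):
    # K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)
    d = (A * A).sum(1)[:, None] + (B * B).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * d)

def solve_cut_qp(H, b, C):
    # Dual over the working set of cuts:
    #   max  alpha.b - 0.5 alpha^T H alpha   s.t.  alpha >= 0, sum(alpha) <= C
    t = len(b)
    res = minimize(lambda a: 0.5 * a @ H @ a - a @ b,
                   np.zeros(t),
                   jac=lambda a: H @ a - b,
                   bounds=[(0.0, C)] * t,
                   constraints=[{'type': 'ineq', 'fun': lambda a: C - a.sum()}],
                   method='SLSQP')
    return res.x

def train_sampled_cuts(X, y, C=1.0, r=50, gamma=0.5, eps=1e-3,
                       max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    cut_idx, cut_sgn, b = [], [], []   # cut t: g_t = (1/r) sum_i sgn_i k(x_i, .)
    H = np.zeros((0, 0))               # Gram matrix of the cuts
    alpha = np.zeros(0)

    for _ in range(max_iter):
        # Draw a sample and build the approximate cut from the margin
        # violators inside it; kernel work per cut is O(r), not O(n).
        S = rng.choice(n, size=r, replace=False)
        scores = np.zeros(r)
        for a, ii, si in zip(alpha, cut_idx, cut_sgn):
            scores += a * (rbf_kernel(X[S], X[ii], gamma) @ si) / r
        V = S[y[S] * scores < 1.0]               # violators in the sample
        sgn_new, b_new = y[V].astype(float), len(V) / r

        # Inner products of the new cut with the old ones and with itself.
        h = np.array([si @ rbf_kernel(X[ii], X[V], gamma) @ sgn_new / r**2
                      for ii, si in zip(cut_idx, cut_sgn)])
        d = sgn_new @ rbf_kernel(X[V], X[V], gamma) @ sgn_new / r**2

        # Stop when the new cut is violated by at most eps more than the
        # slack already allowed by the current working set.
        xi = max(0.0, (np.asarray(b) - alpha @ H).max()) if cut_idx else 0.0
        if b_new - alpha @ h <= xi + eps:
            break

        cut_idx.append(V); cut_sgn.append(sgn_new); b.append(b_new)
        H = (np.block([[H, h[:, None]], [h[None, :], np.array([[d]])]])
             if len(h) else np.array([[d]]))
        alpha = solve_cut_qp(H, np.array(b), C)  # re-solve the small QP

    def predict(Xtest):
        s = np.zeros(len(Xtest))
        for a, ii, si in zip(alpha, cut_idx, cut_sgn):
            s += a * (rbf_kernel(Xtest, X[ii], gamma) @ si) / r
        return np.sign(s)
    return predict
```

Under these assumptions, the per-iteration kernel cost depends on the sample size r and the sizes of the stored cuts rather than on the full training-set size, which is the complexity improvement the abstract describes; the exact-cut variant is recovered by scoring and collecting violators over all n examples instead of the sample.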