Training structural SVMs when exact inference is intractable

Authors:
Thomas Finley; Thorsten Joachims
Affiliations:
Cornell University, Ithaca, NY; Cornell University, Ithaca, NY
Venue:
Proceedings of the 25th international conference on Machine learning
Year:
2008

Abstract

While discriminative training (e.g., CRF, structural SVM) holds much promise for machine translation, image segmentation, and clustering, the complex inference these applications require makes exact training intractable. This creates a need for approximate training methods. Unfortunately, knowledge about how to perform efficient and effective approximate training is limited. Focusing on structural SVMs, we provide and explore algorithms for two classes of approximate training, which we call undergenerating (e.g., greedy) and overgenerating (e.g., relaxations). We provide a theoretical and empirical analysis of both types of approximately trained structural SVMs, focusing on fully connected pairwise Markov random fields. We find that models trained with overgenerating methods have theoretical advantages over those trained with undergenerating methods and are empirically robust relative to their undergenerating brethren, and that models trained with relaxations favor non-fractional predictions from relaxed predictors.
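
To make the term "undergenerating" concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes a toy fully connected pairwise model over binary labels with randomly drawn scores, and compares exhaustive inference with a simple greedy bit-flip search. All function names (score, exact_argmax, greedy_argmax, random_model) are hypothetical. Because the greedy search explores only part of the output space, the labeling it returns can never score higher than the exact maximizer.

```python
import itertools
import random


def score(labels, unary, pairwise):
    """Total score of a binary labeling under a fully connected pairwise model."""
    n = len(labels)
    total = sum(unary[i][labels[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            total += pairwise[(i, j)][labels[i]][labels[j]]
    return total


def exact_argmax(unary, pairwise):
    """Exhaustive inference: enumerates all 2^n labelings (toy sizes only)."""
    n = len(unary)
    return max(itertools.product((0, 1), repeat=n),
               key=lambda y: score(y, unary, pairwise))


def greedy_argmax(unary, pairwise):
    """Undergenerating sketch: flip the single best label until no flip helps."""
    n = len(unary)
    labels = [0] * n
    while True:
        current = score(labels, unary, pairwise)
        best_gain, best_i = 0.0, None
        for i in range(n):
            labels[i] ^= 1  # try flipping label i
            gain = score(labels, unary, pairwise) - current
            labels[i] ^= 1  # undo the trial flip
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i is None:  # local optimum: no single flip improves the score
            return tuple(labels)
        labels[best_i] ^= 1


def random_model(n, seed):
    """Hypothetical toy model: random unary and pairwise scores (assumption)."""
    rng = random.Random(seed)
    unary = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(n)]
    pairwise = {(i, j): [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
                for i in range(n) for j in range(i + 1, n)}
    return unary, pairwise


if __name__ == "__main__":
    unary, pairwise = random_model(6, seed=0)
    y_greedy = greedy_argmax(unary, pairwise)
    y_exact = exact_argmax(unary, pairwise)
    print("greedy:", y_greedy, score(y_greedy, unary, pairwise))
    print("exact: ", y_exact, score(y_exact, unary, pairwise))
```

An overgenerating method would instead optimize over a superset of the valid labelings, for example a relaxation that allows fractional label values, so its optimum can only match or exceed the exact score. That case is not sketched here.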