An introduction to variational methods for graphical models. Learning in Graphical Models.
Fast Approximate Energy Minimization via Graph Cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. Proceedings of the Eighteenth International Conference on Machine Learning (ICML '01).
Ultraconservative online algorithms for multiclass problems. The Journal of Machine Learning Research.
Learning structured prediction models: a large margin approach.
Large Margin Methods for Structured and Interdependent Output Variables. The Journal of Machine Learning Research.
Training linear SVMs in linear time. Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Online large-margin training of dependency parsers. Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL '05).
Flexible text segmentation with structured multilabel classification. Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing (HLT '05).
Structured Prediction, Dual Extragradient and Bregman Projections. The Journal of Machine Learning Research.
Solving multiclass support vector machines with LaRank. Proceedings of the 24th International Conference on Machine Learning.
Structured prediction by joint kernel support estimation. Machine Learning.
Temporal maximum margin Markov network. Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD '10), Part I.
Structured Learning and Prediction in Computer Vision. Foundations and Trends® in Computer Graphics and Vision.
Extracting keyphrase set with high diversity and coverage using structural SVM. Proceedings of the 14th Asia-Pacific International Conference on Web Technologies and Applications (APWeb '12).
Tsochantaridis et al. (2005) proposed two formulations for maximum margin training over structured output spaces: margin scaling and slack scaling. Margin scaling has been used extensively because it requires the same kind of MAP inference as ordinary structured prediction, while slack scaling is believed to be more accurate and better behaved. We present an efficient variational approximation to the slack scaling method that removes its inference bottleneck while retaining its accuracy advantage over margin scaling. We further argue that existing scaling approaches do not separate the true labeling comprehensively when generating violating constraints. We propose a new max-margin trainer, PosLearn, that generates violators so as to ensure separation at each position of a decomposable loss function. Empirical results on real datasets show that PosLearn can reduce test error by up to 25% over margin scaling and up to 10% over slack scaling. Furthermore, PosLearn violators can be generated more efficiently than slack violators; for many structured tasks, the time required is just twice that of MAP inference.
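As context for the two formulations the abstract contrasts, the margin constraints of Tsochantaridis et al. (2005) can be written as follows; notation follows that paper, with $\delta\psi_i(y) = \psi(x_i, y_i) - \psi(x_i, y)$ the feature difference for training example $i$, $\Delta(y_i, y)$ the loss of labeling $y$, and $\xi_i$ a slack variable.

Margin scaling:
$$\forall i,\ \forall y \in \mathcal{Y} \setminus \{y_i\}: \quad \mathbf{w}^\top \delta\psi_i(y) \ge \Delta(y_i, y) - \xi_i$$

Slack scaling:
$$\forall i,\ \forall y \in \mathcal{Y} \setminus \{y_i\}: \quad \mathbf{w}^\top \delta\psi_i(y) \ge 1 - \frac{\xi_i}{\Delta(y_i, y)}$$

Under margin scaling, finding the most violated constraint is loss-augmented MAP inference, $\arg\max_{y} \{\Delta(y_i, y) + \mathbf{w}^\top \psi(x_i, y)\}$, which decomposes just like ordinary MAP whenever the loss decomposes. The slack-scaling oracle instead multiplies the loss into the margin term, so it does not decompose the same way; this is the inference bottleneck that the variational approximation and the per-position PosLearn constraints described above are designed to avoid.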