Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Convex Optimization
Smooth minimization of non-smooth functions
Mathematical Programming: Series A and B
Shallow parsing with conditional random fields
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Excessive Gap Technique in Nonsmooth Convex Minimization
SIAM Journal on Optimization
Large Margin Methods for Structured and Interdependent Output Variables
The Journal of Machine Learning Research
A discriminative matching approach to word alignment
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Structured Prediction, Dual Extragradient and Bregman Projections
The Journal of Machine Learning Research
Predicting Structured Data (Neural Information Processing)
Exponentiated Gradient Algorithms for Conditional Random Fields and Max-Margin Markov Networks
The Journal of Machine Learning Research
Graphical Models, Exponential Families, and Variational Inference
Foundations and Trends in Machine Learning
Cutting-plane training of structural SVMs
Machine Learning
Bundle Methods for Regularized Risk Minimization
The Journal of Machine Learning Research
Factor graphs and the sum-product algorithm
IEEE Transactions on Information Theory
Mirror descent and nonlinear projected subgradient methods for convex optimization
Operations Research Letters
Structured output prediction is an important machine learning problem in both theory and practice, and the max-margin Markov network (M^3N) is an effective approach. All state-of-the-art algorithms for optimizing M^3N objectives take at least O(1/ε) iterations to find an ε-accurate solution. Nesterov [1] broke this barrier by proposing an excessive gap reduction technique (EGR) which converges in O(1/√ε) iterations. However, it is restricted to Euclidean projections, which require an intractable amount of computation per iteration when applied to M^3N. In this paper, we show that by extending EGR to Bregman projections this faster rate of convergence can be retained and, more importantly, the updates can be performed efficiently by exploiting graphical model factorization. Further, we design a kernelized procedure which allows all computations per iteration to be performed at the same cost as in the state-of-the-art approaches.
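To make the abstract's claims concrete, here is a minimal sketch of the machinery it refers to, in our own notation following the excessive gap paper cited above; the symbols f, φ, d, D and the smoothing parameters μ are our assumptions, not the paper's exact formulation.

```latex
% Bregman divergence induced by a strongly convex prox-function d:
\[
  D_d(x, y) \;=\; d(x) - d(y) - \langle \nabla d(y),\, x - y \rangle .
\]
% Excessive gap condition: maintain primal/dual iterates (x_k, u_k) and
% smoothing parameters (mu_1^k, mu_2^k) so that the smoothed primal lies
% below the smoothed dual; weak duality then bounds the true gap:
\[
  f_{\mu_2^k}(x_k) \;\le\; \phi_{\mu_1^k}(u_k)
  \quad\Longrightarrow\quad
  f(x_k) - \phi(u_k) \;\le\; \mu_1^k D_1 + \mu_2^k D_2 .
\]
% The duality gap is thus controlled by the smoothing parameters alone.
% With the strong convexity supplied by the quadratic regularizer of the
% M^3N objective, the mu's can be decreased at rate O(1/k^2), which is
% what yields an epsilon-accurate solution in O(1/sqrt(epsilon)) steps.
```

As a toy illustration of why Bregman projections can be cheap, consider the entropy prox-function: the projection step then has a closed-form multiplicative update (an exponentiated-gradient step). The function below is a hypothetical sketch over an explicit distribution, not the paper's procedure; in M^3N the distribution ranges over exponentially many joint labelings, and the point of the paper is that this same update factorizes over the cliques of the graphical model.

```python
import numpy as np

def entropic_bregman_step(mu, grad, eta):
    """Solve argmin_x <grad, x> + (1/eta) * KL(x || mu) over the simplex.

    With the entropy prox-function, the Bregman projection has this
    closed form, so no iterative projection routine is needed.
    """
    unnorm = mu * np.exp(-eta * grad)   # multiplicative (exp-gradient) update
    return unnorm / unnorm.sum()        # renormalize onto the simplex

# Toy usage: a uniform distribution over 4 labelings, nudged by a gradient.
mu = np.full(4, 0.25)
grad = np.array([0.0, 1.0, -1.0, 0.5])
print(entropic_bregman_step(mu, grad, eta=0.1))
```

By contrast, a Euclidean projection onto the same constraint set has no such closed form over the factorized representation, which is the obstacle the Bregman extension removes.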