Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Convex Optimization
Smooth minimization of non-smooth functions
Mathematical Programming: Series A and B
Shallow parsing with conditional random fields
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Excessive Gap Technique in Nonsmooth Convex Minimization
SIAM Journal on Optimization
Large Margin Methods for Structured and Interdependent Output Variables
The Journal of Machine Learning Research
A discriminative matching approach to word alignment
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Structured Prediction, Dual Extragradient and Bregman Projections
The Journal of Machine Learning Research
Predicting Structured Data (Neural Information Processing)
Exponentiated Gradient Algorithms for Conditional Random Fields and Max-Margin Markov Networks
The Journal of Machine Learning Research
Graphical Models, Exponential Families, and Variational Inference
Foundations and Trends in Machine Learning
Cutting-plane training of structural SVMs
Machine Learning
Bundle Methods for Regularized Risk Minimization
The Journal of Machine Learning Research
Factor graphs and the sum-product algorithm
IEEE Transactions on Information Theory
Mirror descent and nonlinear projected subgradient methods for convex optimization
Operations Research Letters
Structured output prediction is an important machine learning problem in both theory and practice, and the max-margin Markov network (M^3N) is an effective approach. All state-of-the-art algorithms for optimizing M^3N objectives take at least O(1/ε) iterations to find an ε-accurate solution. Nesterov [1] broke this barrier by proposing an excessive gap reduction technique (EGR) which converges in O(1/√ε) iterations. However, it is restricted to Euclidean projections, which require an intractable amount of computation per iteration when applied to M^3N. In this paper, we show that by extending EGR to Bregman projections this faster rate of convergence can be retained and, more importantly, the updates can be performed efficiently by exploiting graphical model factorization. Further, we design a kernelized procedure which allows all computations per iteration to be performed at the same cost as in the state-of-the-art approaches.
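To make the abstract's claims concrete, here is a minimal sketch of the machinery it refers to, in our own notation following the excessive gap paper cited above; the symbols f, φ, d, D and the smoothing parameters μ are our assumptions, not the paper's exact formulation.

```latex
% Bregman divergence induced by a strongly convex prox-function d:
\[
  D_d(x, y) \;=\; d(x) - d(y) - \langle \nabla d(y),\, x - y \rangle .
\]
% Excessive gap condition: maintain primal/dual iterates (x_k, u_k) and
% smoothing parameters (mu_1^k, mu_2^k) so that the smoothed primal lies
% below the smoothed dual; weak duality then bounds the true gap:
\[
  f_{\mu_2^k}(x_k) \;\le\; \phi_{\mu_1^k}(u_k)
  \quad\Longrightarrow\quad
  f(x_k) - \phi(u_k) \;\le\; \mu_1^k D_1 + \mu_2^k D_2 .
\]
% The duality gap is thus controlled by the smoothing parameters alone.
% With the strong convexity supplied by the quadratic regularizer of the
% M^3N objective, the mu's can be decreased at rate O(1/k^2), which is
% what yields an epsilon-accurate solution in O(1/sqrt(epsilon)) steps.
```

As a toy illustration of why Bregman projections can be cheap, consider the entropy prox-function: the projection step then has a closed-form multiplicative update (an exponentiated-gradient step). The function below is a hypothetical sketch over an explicit distribution, not the paper's procedure; in M^3N the distribution ranges over exponentially many joint labelings, and the point of the paper is that this same update factorizes over the cliques of the graphical model.

```python
import numpy as np

def entropic_bregman_step(mu, grad, eta):
    """Solve argmin_x <grad, x> + (1/eta) * KL(x || mu) over the simplex.

    With the entropy prox-function, the Bregman projection has this
    closed form, so no iterative projection routine is needed.
    """
    unnorm = mu * np.exp(-eta * grad)   # multiplicative (exp-gradient) update
    return unnorm / unnorm.sum()        # renormalize onto the simplex

# Toy usage: a uniform distribution over 4 labelings, nudged by a gradient.
mu = np.full(4, 0.25)
grad = np.array([0.0, 1.0, -1.0, 0.5])
print(entropic_bregman_step(mu, grad, eta=0.1))
```

By contrast, a Euclidean projection onto the same constraint set has no such closed form over the factorized representation, which is the obstacle the Bregman extension removes.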