Discriminative unsupervised learning of structured predictors

Authors:
Linli Xu; Dana Wilkinson; Finnegan Southey; Dale Schuurmans
Affiliations:
University of Waterloo, Waterloo ON, Canada; University of Waterloo, Waterloo ON, Canada; University of Alberta, Edmonton AB, Canada; University of Alberta, Edmonton AB, Canada
Venue:
ICML '06 Proceedings of the 23rd international conference on Machine learning
Year:
2006

Abstract

We present a new unsupervised algorithm for training structured predictors that is discriminative, convex, and avoids the use of EM. The idea is to formulate an unsupervised version of structured learning methods, such as maximum margin Markov networks, that can be trained via semidefinite programming. The result is a discriminative training criterion for structured predictors (like hidden Markov models) that remains unsupervised and does not create local minima. To reduce training cost, we reformulate the training procedure to mitigate the dependence on semidefinite programming, and finally propose a heuristic procedure that avoids semidefinite programming entirely. Experimental results show that the convex discriminative procedure can produce better conditional models than conventional Baum-Welch (EM) training.