Minimum-risk training of approximate CRF-based NLP systems

  • Authors:
  • Veselin Stoyanov; Jason Eisner

  • Affiliations:
  • Johns Hopkins University, Baltimore, MD; Johns Hopkins University, Baltimore, MD

  • Venue:
  • NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
  • Year:
  • 2012

Abstract

Conditional Random Fields (CRFs) are a popular formalism for structured prediction in NLP. It is well known how to train CRFs with certain topologies that admit exact inference, such as linear-chain CRFs. Some NLP phenomena, however, suggest CRFs with more complex topologies. Should such models be used, considering that they make exact inference intractable? Stoyanov et al. (2011) recently argued for training parameters to minimize the task-specific loss of whatever approximate inference and decoding methods will be used at test time. We apply their method to three NLP problems, showing that (i) using more complex CRFs leads to improved performance, and that (ii) minimum-risk training learns more accurate models.
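The training recipe the abstract refers to is concrete enough to sketch. Below is a minimal JAX illustration of the idea, not the authors' ERMA implementation: run a fixed number of unrolled loopy belief-propagation sweeps on a toy cyclic CRF (a topology where exact inference is not guaranteed tractable in general), score the resulting beliefs with a differentiable surrogate for the task loss (expected Hamming loss here), and back-propagate through the inference procedure itself. The 3-variable model, `N_SWEEPS`, `edge_pot`, the gold labeling, and the step size are all invented for this example.

```python
# Minimal sketch (assumed, not the paper's code): minimum-risk training by
# back-propagating a task loss through unrolled loopy belief propagation.
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp

NODES = range(3)
EDGES = [(0, 1), (1, 2), (2, 0)]                 # a cycle, so loopy BP is approximate
DIRECTED = EDGES + [(j, i) for (i, j) in EDGES]
N_SWEEPS = 5                                     # unrolled inference iterations

def edge_pot(i, j, pairwise):
    # Shared log-potential psi(x_i, x_j); transpose when traversing backwards.
    return pairwise if (i, j) in EDGES else pairwise.T

def beliefs(unary, pairwise):
    """N_SWEEPS of sum-product loopy BP in log space; returns log node beliefs."""
    msgs = {e: jnp.zeros(2) for e in DIRECTED}   # log-message m_{i->j}(x_j)
    for _ in range(N_SWEEPS):
        new = {}
        for (i, j) in DIRECTED:
            # Combine unary score and all messages into i except the one from j.
            inc = unary[i] + sum(msgs[(k, l)] for (k, l) in DIRECTED
                                 if l == i and k != j)
            m = logsumexp(inc[:, None] + edge_pot(i, j, pairwise), axis=0)
            new[(i, j)] = m - logsumexp(m)       # normalize for numerical stability
        msgs = new
    b = jnp.stack([unary[i] + sum(msgs[(k, l)] for (k, l) in DIRECTED if l == i)
                   for i in NODES])
    return b - logsumexp(b, axis=1, keepdims=True)

def risk(params, gold):
    """Differentiable surrogate risk: expected Hamming loss under BP beliefs."""
    unary, pairwise = params
    logb = beliefs(unary, pairwise)
    return jnp.sum(1.0 - jnp.exp(logb[jnp.arange(3), gold]))

gold = jnp.array([0, 1, 1])                      # hypothetical gold labeling
params = (jnp.zeros((3, 2)), jnp.zeros((2, 2)))  # unary and pairwise log-scores
grads = jax.grad(risk)(params, gold)             # backprop through unrolled BP
params = jax.tree_util.tree_map(lambda p, g: p - 0.5 * g, params, grads)
print(float(risk(params, gold)))                 # should drop below the initial 1.5
```

Because the loss is measured on the output of the same approximate inference routine that will run at test time, the learned parameters can compensate for BP's approximation error, rather than optimizing conditional likelihood under an exact model that will never actually be used.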