Efficient computation of entropy gradient for semi-supervised conditional random fields

  • Authors:
  • Gideon S. Mann; Andrew McCallum

  • Affiliations:
  • University of Massachusetts, Amherst, MA; University of Massachusetts, Amherst, MA

  • Venue:
  • NAACL-Short '07: Human Language Technologies 2007, The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
  • Year:
  • 2007

Abstract

Entropy regularization is a straightforward and successful method of semi-supervised learning that augments the traditional conditional likelihood objective function with an additional term that aims to minimize the predicted label entropy on unlabeled data. It has previously been demonstrated to provide positive results in linear-chain CRFs, but the published method for calculating the entropy gradient requires significantly more computation than supervised CRF training. This paper presents a new derivation and dynamic program for calculating the entropy gradient that is significantly more efficient, having the same asymptotic time complexity as supervised CRF training. We also present efficient generalizations of this method for calculating the label entropy of all sub-sequences, which is useful for active learning, among other applications.
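
For concreteness, the entropy-regularized objective the abstract refers to is commonly written as below; the weighting term λ and the exact notation are a gloss on the standard formulation (in the spirit of Grandvalet and Bengio's entropy regularization), not a quotation from the paper itself:

```latex
\max_{\theta}\;\sum_{i=1}^{n}\log p_{\theta}\big(y^{(i)}\mid x^{(i)}\big)
\;-\;\lambda\sum_{u=1}^{m} H\big(p_{\theta}(Y\mid x^{(u)})\big),
\qquad
H\big(p_{\theta}(Y\mid x)\big)
= -\sum_{y} p_{\theta}(y\mid x)\,\log p_{\theta}(y\mid x),
```

where the first sum ranges over the n labeled sequences and the second over the m unlabeled ones.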
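As a minimal sketch of why the entropy of a linear-chain CRF is computable in forward-backward time at all (this is not the authors' dynamic program for the entropy *gradient*, and all names below are illustrative), one can use the identity H(Y|x) = log Z - E[s(Y)], where s(y) is the unnormalized log-score of a labeling y; the expectation decomposes over node and edge marginals:

```python
import numpy as np
from scipy.special import logsumexp

def sequence_entropy(unary, pairwise):
    """Entropy H(Y | x) of a linear-chain CRF, in nats.

    unary:    (T, K) per-position log potentials
    pairwise: (T-1, K, K) transition log potentials
    Uses H = log Z - E[s(Y)], with E[s(Y)] computed from marginals.
    """
    T, K = unary.shape

    # Forward pass: alpha[t, j] = log-sum over prefixes ending in label j.
    alpha = np.empty((T, K))
    alpha[0] = unary[0]
    for t in range(1, T):
        alpha[t] = unary[t] + logsumexp(alpha[t - 1][:, None] + pairwise[t - 1], axis=0)
    log_Z = logsumexp(alpha[-1])

    # Backward pass: beta[t, i] = log-sum over suffixes following label i.
    beta = np.zeros((T, K))
    for t in range(T - 2, -1, -1):
        beta[t] = logsumexp(pairwise[t] + unary[t + 1][None, :] + beta[t + 1][None, :], axis=1)

    # Node and edge marginals from the two passes.
    node_marg = np.exp(alpha + beta - log_Z)                                # (T, K)
    edge_marg = np.exp(alpha[:-1, :, None] + pairwise
                       + unary[1:, None, :] + beta[1:, None, :] - log_Z)    # (T-1, K, K)

    # E[s(Y)] decomposes over the node and edge marginals.
    expected_score = (node_marg * unary).sum() + (edge_marg * pairwise).sum()
    return log_Z - expected_score

# Usage on random potentials: the result lies between 0 and T * log(K).
rng = np.random.default_rng(0)
T, K = 5, 3
print(sequence_entropy(rng.normal(size=(T, K)), rng.normal(size=(T - 1, K, K))))
```

This runs in the same O(T K^2) time as one round of forward-backward, matching the asymptotic complexity claim in the abstract; the paper's contribution is an analogous dynamic program for the gradient of this entropy and for the entropies of all sub-sequences.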