Machine learning often relies on costly labeled data, and this impedes its application to new classification and information extraction problems. This has motivated the development of methods for leveraging abundant prior knowledge about these problems, including methods for lightly supervised learning using model expectation constraints. Building on this work, we envision an interactive training paradigm in which practitioners perform evaluation, analyze errors, and provide and refine expectation constraints in a closed loop. In this paper, we focus on several key subproblems in this paradigm that can be cast as selecting a representative sample of the unlabeled data for the practitioner to inspect. To address these problems, we propose stratified sampling methods that use model expectations as a proxy for latent output variables. In classification and sequence labeling experiments, these sampling strategies reduce accuracy evaluation effort by as much as 53%, provide more reliable estimates of $F_1$ for rare labels, and aid in the specification and refinement of constraints.
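The stratified sampling idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes examples are binned into strata by the model's confidence in its own prediction (a simple proxy for the latent correctness of each output), a fixed number of examples per stratum is shown to the practitioner, and per-stratum accuracies are combined with weights proportional to stratum size. The function and parameter names (`stratified_accuracy_estimate`, `num_strata`, `per_stratum`, the `oracle` callback standing in for practitioner inspection) are hypothetical.

```python
import random
from collections import defaultdict

def stratified_accuracy_estimate(probs, oracle, num_strata=5, per_stratum=10, seed=0):
    """Estimate classifier accuracy by stratified sampling (illustrative sketch).

    probs:  per-example probability the model assigns to its predicted label,
            used as a proxy for the unknown correctness of each prediction.
    oracle: function i -> bool, whether prediction i is correct (stands in
            for the practitioner inspecting example i).
    """
    rng = random.Random(seed)
    n = len(probs)

    # Assign each unlabeled example to a stratum by model confidence.
    strata = defaultdict(list)
    for i, p in enumerate(probs):
        s = min(int(p * num_strata), num_strata - 1)
        strata[s].append(i)

    # Sample uniformly within each stratum; weight per-stratum accuracy
    # by the stratum's share of the full pool.
    estimate = 0.0
    for idxs in strata.values():
        sample = rng.sample(idxs, min(per_stratum, len(idxs)))
        acc_s = sum(oracle(i) for i in sample) / len(sample)
        estimate += (len(idxs) / n) * acc_s
    return estimate
```

Because correctness tends to correlate with confidence, strata are more homogeneous than the full pool, so the stratified estimator reaches a given variance with fewer inspected examples than uniform sampling would, which is the source of the evaluation-effort savings claimed above.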