A common obstacle preventing the rapid deployment of supervised machine learning algorithms is the lack of labeled training data. This is particularly expensive to obtain for structured prediction tasks, where each training instance may have multiple, interacting labels, all of which must be correctly annotated for the instance to be of use to the learner. Traditional active learning addresses this problem by optimizing the order in which the examples are labeled to increase learning efficiency. However, this approach does not consider the difficulty of labeling each example, which can vary widely in structured prediction tasks. For example, the labeling predicted by a partially trained system may be easier to correct for some instances than for others. We propose a new active learning paradigm which reduces not only how many instances the annotator must label, but also how difficult each instance is to annotate. The system also leverages information from partially correct predictions to efficiently solicit annotations from the user. We validate this active learning framework in an interactive information extraction system, reducing the total number of annotation actions by 22%.
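The selection step underlying this kind of active learning — repeatedly querying the pool example the partially trained model is least confident about — can be sketched in a few lines. This is a generic uncertainty-sampling loop for illustration only, not the paper's system: the function names, the simulated annotator, and the `confidence` scoring function are all hypothetical stand-ins (in practice, confidence might come from a CRF's marginal probabilities over the predicted labeling).

```python
# Illustrative uncertainty-sampling active learner (hypothetical names,
# not the paper's implementation).

def least_confident(pool, confidence):
    """Index of the pool item with the lowest model confidence score."""
    return min(range(len(pool)), key=lambda i: confidence(pool[i]))

def active_learning_loop(pool, labels, confidence, budget):
    """Query the least-confident example each round, up to `budget` queries.

    `labels` simulates the human annotator: in a real system the label
    would be elicited interactively, possibly by correcting the model's
    partially correct prediction rather than labeling from scratch.
    """
    pool, labels = list(pool), list(labels)
    annotated = []
    for _ in range(min(budget, len(pool))):
        i = least_confident(pool, confidence)
        annotated.append((pool.pop(i), labels.pop(i)))
        # A real loop would retrain the model here and recompute confidence.
    return annotated
```

The loop deliberately omits retraining so the sketch stays self-contained; in a deployed system the model is updated after each batch of queries, which is what makes the query order matter.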