Parallel active learning: eliminating wait time with minimal staleness

  • Authors: Robbie Haertel; Paul Felt; Eric Ringger; Kevin Seppi
  • Affiliation: Brigham Young University, Provo, Utah

  • Venue: ALNLP '10, Proceedings of the NAACL HLT 2010 Workshop on Active Learning for Natural Language Processing
  • Year: 2010

Abstract

A practical concern for Active Learning (AL) is the amount of time human experts must wait for the next instance to label. We propose a method for eliminating this wait time independent of specific learning and scoring algorithms by making scores always available for all instances, using old (stale) scores when necessary. The time during which the expert is annotating is used to train models and score instances--in parallel--to maximize the recency of the scores. Our method can be seen as a parameterless, dynamic batch AL algorithm. We analyze the amount of staleness introduced by various AL schemes and then examine the effect of the staleness on performance on a part-of-speech tagging task on the Wall Street Journal. Empirically, the parallel AL algorithm effectively has a batch size of one and a large candidate set size but eliminates the time an annotator would have to wait for a similarly parameterized batch scheme to select instances. The exact performance of our method on other tasks will depend on the relative ratios of time spent annotating, training, and scoring, but in general we expect our parameterless method to perform favorably compared to batch when accounting for wait time.
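The core mechanism described above (always-available, possibly stale scores refreshed by a background training/scoring process during annotation time) can be illustrated with a minimal sketch. The sketch below is an illustration of the general idea only, not the authors' implementation: the class name, the threading-based design, and the placeholder `train_model` and `score_instance` functions are all hypothetical.

```python
# Sketch of parallel active learning with stale scores (illustrative only).
# The annotator-facing call never blocks: every unlabeled instance always has
# some score, and a background worker retrains and rescores while annotation
# proceeds, keeping scores as recent as possible.

import random
import threading
import time


def train_model(labeled_pairs):
    # Stand-in trainer: a real system would fit e.g. a POS-tagging model here.
    return len(labeled_pairs)


def score_instance(model, instance):
    # Stand-in uncertainty score; a real scorer would use the model's predictions.
    time.sleep(0.001)  # simulate scoring cost
    return random.random()


class ParallelActiveLearner:
    def __init__(self, unlabeled):
        self.lock = threading.Lock()
        self.labeled = []  # (instance, label) pairs collected so far
        # Initial scores (e.g. random) so a score is available from the start.
        self.scores = {x: random.random() for x in unlabeled}
        self.stop = threading.Event()

    def train_and_score_loop(self):
        """Background worker: retrain on current labels, then refresh scores."""
        while not self.stop.is_set():
            with self.lock:
                snapshot = list(self.labeled)
                pool = list(self.scores)
            model = train_model(snapshot)
            for x in pool:
                s = score_instance(model, x)
                with self.lock:
                    if x in self.scores:  # may have been selected meanwhile
                        self.scores[x] = s

    def next_instance(self):
        """Annotator-facing call: returns immediately using possibly stale scores."""
        with self.lock:
            best = max(self.scores, key=self.scores.get)
            del self.scores[best]
            return best

    def add_label(self, instance, label):
        with self.lock:
            self.labeled.append((instance, label))


if __name__ == "__main__":
    learner = ParallelActiveLearner(unlabeled=list(range(100)))
    worker = threading.Thread(target=learner.train_and_score_loop, daemon=True)
    worker.start()
    for _ in range(5):
        x = learner.next_instance()           # no wait: a stale score is acceptable
        learner.add_label(x, f"label-{x}")    # worker keeps rescoring during annotation
    learner.stop.set()
```

In this sketch the effective batch size is one, as in the paper's characterization: each selection uses the freshest scores available at that moment, and staleness arises only from how much retraining and rescoring the worker completes between selections.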