Many active learning methods incorporate annotation cost or expert quality into their framework when selecting the best data for annotation. While these methods model expert quality, availability, or expertise, they exert no direct influence on any of these elements. We present a novel framework, built on decision-theoretic active learning, that allows the learner to directly control label quality by allocating a time budget to each annotation. We show that our method improves the efficiency of the active learner through an interruption mechanism that trades off induced error against annotation cost. Our simulation experiments on three document classification tasks show that some interruption is almost always better than none, but that the optimal interruption time varies by dataset.
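The core trade-off described above — more annotation time yields higher label quality but at higher cost — can be sketched with a toy decision-theoretic model. Everything here is illustrative: the exponential quality curve, the `tau`, `error_cost`, and `cost_per_second` parameters, and the candidate budgets are all assumptions, not values from the paper.

```python
import math

def label_quality(t, tau=5.0):
    # Hypothetical quality curve: more annotation time yields a higher
    # probability of a correct label, with diminishing returns.
    return 1.0 - math.exp(-t / tau)

def expected_utility(t, error_cost=10.0, cost_per_second=0.5):
    # Trade off the expected cost of induced error (a wrong label)
    # against the linear cost of annotation time. Both cost weights
    # are made-up constants for illustration.
    expected_error = 1.0 - label_quality(t)
    return -(error_cost * expected_error) - cost_per_second * t

def best_interruption_time(candidate_budgets):
    # Interrupt the annotator at the time budget maximizing expected utility.
    return max(candidate_budgets, key=expected_utility)

budgets = [1, 2, 5, 10, 20, 40]
t_star = best_interruption_time(budgets)
```

Under these particular constants, an intermediate budget wins: very short budgets leave too much label error, while very long ones pay annotation cost for negligible quality gains — mirroring the paper's finding that some interruption beats none, but the best interruption point depends on the data.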