Supervised classification requires large amounts of annotated training data, which is expensive to create. Two approaches that reduce the cost of annotation are active learning and crowdsourcing; however, the two have not been combined successfully to date. We evaluate the utility of active learning in a crowdsourcing setting on two tasks, named entity recognition and sentiment detection, and show that active learning outperforms random selection of annotation examples even in a noisy crowdsourcing scenario.
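The abstract contrasts active learning with random selection of annotation examples. The sketch below illustrates the general idea with least-confident (uncertainty) sampling for a binary classifier; it is a minimal illustration, not the paper's actual method, and `predict_proba` is a hypothetical stand-in for whatever model the annotator loop uses.

```python
import random

def uncertainty_sample(pool, predict_proba, k):
    """Active-learning selection: pick the k unlabeled examples whose
    predicted positive-class probability is closest to 0.5, i.e. the
    examples the current model is least confident about."""
    scored = [(abs(predict_proba(x) - 0.5), x) for x in pool]
    scored.sort(key=lambda t: t[0])
    return [x for _, x in scored[:k]]

def random_sample(pool, k):
    """Baseline: pick k examples uniformly at random."""
    return random.sample(pool, k)

# Toy demo with a hypothetical model whose predicted probability is the
# example value itself, so uncertainty peaks near 0.5.
pool = [0.05, 0.2, 0.45, 0.5, 0.55, 0.8, 0.95]
print(uncertainty_sample(pool, lambda x: x, 3))  # → [0.5, 0.45, 0.55]
```

In an annotation loop, the selected examples would be sent to crowd workers for labeling, the model retrained on the new labels, and the selection repeated.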