The wisdom of minority: discovering and targeting the right group of workers for crowdsourcing

Authors:
Hongwei Li;Bo Zhao;Ariel Fuxman
Affiliations:
Department of Statistics, UC Berkeley, Berkeley, CA, USA;Microsoft Research, Silicon Valley, Mountain View, CA, USA;Microsoft Research, Silicon Valley, Mountain View, CA, USA
Venue:
Proceedings of the 23rd international conference on World wide web
Year:
2014



Abstract

Worker reliability is a longstanding issue in crowdsourcing, and the automatic discovery of high-quality workers is an important practical problem. Most previous work on this problem focuses on estimating the quality of each individual worker jointly with the true answer of each task. In practice, however, worker quality on some tasks may be associated with explicit characteristics of the worker, such as education level, major, and age. This raises the following question: how do we automatically discover the worker attributes relevant to a given task, and then use the findings to improve data quality? In this paper, we propose a general crowd targeting framework that automatically discovers, for a given task, whether any group of workers defined by their attributes has higher quality on average, and targets such groups, if they exist, for future work on the same task. Our crowd targeting framework is complementary to traditional worker quality estimation approaches. A further advantage of our framework is that it is more budget-efficient, because we can target potentially good workers before they actually do the task. Experiments on real datasets show that the accuracy of the final prediction can be improved significantly for the same budget (or even a smaller budget in some cases). Our framework can be applied to many real-world tasks and can be easily integrated into current crowdsourcing platforms.
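
The abstract does not specify how groups are discovered, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes each worker has a known accuracy on a small set of probe tasks with known answers, groups workers by a single attribute, and flags groups whose mean accuracy exceeds the overall mean by a margin. The function name `discover_target_groups`, the parameters `min_size` and `margin`, and the toy worker records are all hypothetical.

```python
from collections import defaultdict


def discover_target_groups(workers, attribute, min_size=2, margin=0.05):
    """Return (overall_mean, {attribute_value: group_mean}) for groups
    whose mean accuracy beats the overall mean by at least `margin`.

    Assumption: each worker dict carries an "accuracy" measured on
    probe tasks with known answers. Groups smaller than `min_size`
    are ignored to avoid targeting on too little evidence.
    """
    overall = sum(w["accuracy"] for w in workers) / len(workers)

    # Collect accuracies per attribute value (e.g. per major).
    by_value = defaultdict(list)
    for w in workers:
        by_value[w[attribute]].append(w["accuracy"])

    targets = {}
    for value, accs in by_value.items():
        mean = sum(accs) / len(accs)
        if len(accs) >= min_size and mean - overall >= margin:
            targets[value] = mean
    return overall, targets


# Toy, made-up data purely for illustration.
workers = [
    {"id": "w1", "major": "science", "accuracy": 0.9},
    {"id": "w2", "major": "science", "accuracy": 0.8},
    {"id": "w3", "major": "arts", "accuracy": 0.6},
    {"id": "w4", "major": "arts", "accuracy": 0.5},
    {"id": "w5", "major": "business", "accuracy": 0.7},
]

overall, targets = discover_target_groups(workers, "major")
print(sorted(targets))  # attribute values worth targeting for future work
```

On this toy data only the "science" group qualifies: its mean accuracy is well above the overall mean, while the single "business" worker is excluded by the minimum group size. A real system would presumably also need a statistical test to guard against groups that look good by chance.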