Information retrieval systems require human-contributed relevance labels for their training and evaluation. Increasingly, such labels are collected under the anonymous, uncontrolled conditions of crowdsourcing, leading to output of varied quality. While a range of quality assurance and control techniques has been developed to reduce noise during or after task completion, little is known about the workers themselves and about possible relationships between workers' characteristics and the quality of their work. In this paper, we ask what the relatively well- or poorly-performing crowds, working under specific task conditions, actually look like in terms of worker characteristics such as demographics and personality traits. Our findings show that the face of a crowd is indeed indicative of the quality of its work.
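As a purely illustrative sketch of the kind of analysis this question implies (not the paper's actual method or data), the snippet below computes per-worker labeling accuracy against gold-standard relevance judgments and compares it across demographic groups; the toy data and column names (worker_id, label, gold, age_group) are assumptions made for the example.

```python
# Hypothetical sketch: relate per-worker label accuracy (measured against
# gold-standard relevance judgments) to self-reported worker attributes.
import pandas as pd

# Each row: one relevance label produced by one worker for one document.
labels = pd.DataFrame({
    "worker_id": ["w1", "w1", "w2", "w2", "w3", "w3"],
    "label":     [1, 0, 1, 1, 0, 0],
    "gold":      [1, 0, 0, 1, 1, 0],
})

# Self-reported worker attributes collected alongside the task (assumed schema).
workers = pd.DataFrame({
    "worker_id": ["w1", "w2", "w3"],
    "age_group": ["18-25", "26-35", "26-35"],
})

# Per-worker accuracy against the gold labels.
labels["correct"] = (labels["label"] == labels["gold"]).astype(int)
accuracy = (labels.groupby("worker_id")["correct"].mean()
            .rename("accuracy").reset_index())

# Compare mean accuracy across demographic groups.
merged = workers.merge(accuracy, on="worker_id")
print(merged.groupby("age_group")["accuracy"].mean())
```

The same pattern extends to other worker characteristics (e.g., personality-trait scores) by merging additional attribute columns and grouping or correlating on them.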