An analysis of human factors and label accuracy in crowdsourcing relevance judgments
Information Retrieval
Crowdsourcing platforms offer unprecedented opportunities for creating evaluation benchmarks, but the quality of the output varies widely across crowd workers, who differ in competence and motivation. This raises new challenges for quality control and requires an in-depth understanding of how workers' characteristics relate to the quality of their work. In this paper, we use behavioral observations (HIT completion time, fraction of useful labels, label accuracy) to define five worker types: Spammer, Sloppy, Incompetent, Competent, and Diligent. Using data collected from workers engaged in the crowdsourced evaluation of the INEX 2010 Book Track Prove It task, we relate the worker types to label accuracy and to personality traits measured along the "Big Five" dimensions. We expect these insights into the types of crowd workers and the quality of their work to inform how HITs can be designed to attract the best workers to a task, and to explain why certain HIT designs are more effective than others.
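To make the typology concrete, the sketch below shows one way a rule-based classification over the three behavioral observations could look. It is a minimal illustration only: the `WorkerStats` fields, the thresholds, and the ordering of the rules are assumptions chosen for exposition, not the cut-offs used in the paper.

```python
from dataclasses import dataclass

@dataclass
class WorkerStats:
    """Behavioral observations aggregated per worker (as named in the abstract)."""
    median_hit_seconds: float     # median HIT completion time
    useful_label_fraction: float  # fraction of submitted labels that were usable
    label_accuracy: float         # agreement with gold-standard judgments

# Hypothetical thresholds: placeholders to illustrate the idea, not published values.
MIN_HIT_SECONDS = 20.0       # faster than this suggests the task was not read
MIN_USEFUL_FRACTION = 0.5
GOOD_ACCURACY = 0.7
HIGH_ACCURACY = 0.85

def classify_worker(w: WorkerStats) -> str:
    """Map a worker's behavioral profile to one of the five worker types."""
    if w.median_hit_seconds < MIN_HIT_SECONDS and w.useful_label_fraction < MIN_USEFUL_FRACTION:
        return "Spammer"      # rushes through HITs and produces few usable labels
    if w.useful_label_fraction < MIN_USEFUL_FRACTION:
        return "Sloppy"       # spends time on the task but leaves many labels unusable
    if w.label_accuracy < GOOD_ACCURACY:
        return "Incompetent"  # engaged, but the labels are often wrong
    if w.label_accuracy < HIGH_ACCURACY:
        return "Competent"
    return "Diligent"         # careful and highly accurate

if __name__ == "__main__":
    print(classify_worker(WorkerStats(8.0, 0.3, 0.20)))   # -> Spammer
    print(classify_worker(WorkerStats(95.0, 0.9, 0.92)))  # -> Diligent
```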