Detecting and correcting low-quality submissions in crowdsourcing tasks is an important challenge. Prior work has focused primarily on worker outcomes or reputation, using approaches such as agreement across workers or with a gold standard to evaluate quality. We propose an alternative, complementary technique that focuses on how workers work rather than on the products they produce. Our technique captures behavioral traces from online crowd workers and uses them to predict outcome measures such as quality, errors, and the likelihood of cheating. We evaluate the effectiveness of the approach across three contexts: classification, generation, and comprehension tasks. The results indicate that we can build predictive models of task performance based on behavioral traces alone, and that these models generalize to related tasks. Finally, we discuss limitations and extensions of the approach.
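As a rough illustration of the pipeline described above (and not the authors' actual implementation), the sketch below aggregates hypothetical behavioral-trace features per submission, such as time on task, mouse travel, key presses, and scroll events, and fits a standard classifier to flag likely low-quality work. The feature set, the synthetic data, and the choice of logistic regression are all assumptions made purely for illustration.

```python
# Illustrative sketch only: predicting submission quality from
# behavioral-trace features with a standard classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 1000

# Hypothetical per-submission features aggregated from an interaction log:
# total time on task (s), mouse travel (px), key presses, scroll events.
time_on_task = rng.gamma(shape=2.0, scale=60.0, size=n)
mouse_distance = rng.gamma(shape=2.0, scale=3000.0, size=n)
key_presses = rng.poisson(lam=40, size=n)
scroll_events = rng.poisson(lam=10, size=n)

X = np.column_stack([time_on_task, mouse_distance, key_presses, scroll_events])

# Synthetic labels: very fast, low-effort traces are more likely to be
# low quality (1 = low-quality submission). Real labels would come from
# gold-standard checks or expert grading.
p_low = 1.0 / (1.0 + np.exp(0.03 * (time_on_task - 60) + 0.02 * (key_presses - 20)))
y = (rng.random(n) < p_low).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```

In practice the same trained model could be applied to traces from a related task to test how well it generalizes, which is the kind of transfer the abstract alludes to.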