Detecting and correcting low-quality submissions in crowdsourcing tasks is an important challenge. Prior work has focused primarily on worker outcomes or reputation, using approaches such as agreement across workers or with a gold standard to evaluate quality. We propose an alternative, complementary technique that focuses on how workers work rather than on the products they produce. Our technique captures behavioral traces from online crowd workers and uses them to predict outcome measures such as quality, errors, and the likelihood of cheating. We evaluate the effectiveness of the approach across three contexts: classification, generation, and comprehension tasks. The results indicate that we can build predictive models of task performance based on behavioral traces alone, and that these models generalize to related tasks. Finally, we discuss limitations and extensions of the approach.
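As a rough illustration of the pipeline described above (and not the authors' actual implementation), the sketch below aggregates hypothetical behavioral-trace features per submission, such as time on task, mouse travel, key presses, and scroll events, and fits a standard classifier to flag likely low-quality work. The feature set, the synthetic data, and the choice of logistic regression are all assumptions made purely for illustration.

```python
# Illustrative sketch only: predicting submission quality from
# behavioral-trace features with a standard classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 1000

# Hypothetical per-submission features aggregated from an interaction log:
# total time on task (s), mouse travel (px), key presses, scroll events.
time_on_task = rng.gamma(shape=2.0, scale=60.0, size=n)
mouse_distance = rng.gamma(shape=2.0, scale=3000.0, size=n)
key_presses = rng.poisson(lam=40, size=n)
scroll_events = rng.poisson(lam=10, size=n)

X = np.column_stack([time_on_task, mouse_distance, key_presses, scroll_events])

# Synthetic labels: very fast, low-effort traces are more likely to be
# low quality (1 = low-quality submission). Real labels would come from
# gold-standard checks or expert grading.
p_low = 1.0 / (1.0 + np.exp(0.03 * (time_on_task - 60) + 0.02 * (key_presses - 20)))
y = (rng.random(n) < p_low).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```

In practice the same trained model could be applied to traces from a related task to test how well it generalizes, which is the kind of transfer the abstract alludes to.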