Using behavioral data to identify interviewer fabrication in surveys

Authors:
Benjamin Birnbaum;Gaetano Borriello;Abraham D. Flaxman;Brian DeRenzi;Anna R. Karlin
Affiliations:
University of Washington, Seattle, Washington, USA;University of Washington, Seattle, Washington, USA;University of Washington, Seattle, Washington, USA;University of Washington, Seattle, Washington, USA;University of Washington, Seattle, Washington, USA
Venue:
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Year:
2013

Abstract

Surveys conducted by human interviewers are one of the principal means of gathering data from all over the world, but the quality of this data can be threatened by interviewer fabrication. In this paper, we investigate a new approach to detecting interviewer fabrication automatically. We instrument electronic data collection software to record logs of low-level behavioral data and show that supervised classification, when applied to features extracted from these logs, can identify interviewer fabrication with an accuracy of up to 96%. We show that even when interviewers know that our approach is being used, have some knowledge of how it works, and are incentivized to avoid detection, it can still achieve an accuracy of 86%. We also demonstrate the robustness of our approach to a moderate amount of label noise and provide practical recommendations, based on empirical evidence, on how much data is needed for our approach to be effective.
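
The pipeline the abstract describes (log low-level events, extract features, apply a supervised classifier) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not specify the log format, the event names ("next", "back", "edit_answer"), the four features, or the classifier, and the toy data is invented. A nearest-centroid rule stands in for whatever classifier the authors actually used.

```python
from statistics import mean, median

# Hypothetical log format: one list of (timestamp_seconds, event_name)
# tuples per interview. Event names are illustrative, not from the paper.
def extract_features(log):
    """Turn one interview's event log into a small numeric feature vector."""
    times = [t for t, _ in log]
    gaps = [b - a for a, b in zip(times, times[1:])]
    events = [e for _, e in log]
    return [
        times[-1] - times[0],           # total interview duration
        median(gaps) if gaps else 0.0,  # typical pause between events
        events.count("edit_answer"),    # how often an answer was changed
        events.count("back"),           # how often the interviewer went back
    ]

def train_centroids(feature_rows, labels):
    """Compute one mean feature vector per label (nearest-centroid model)."""
    centroids = {}
    for label in set(labels):
        rows = [r for r, l in zip(feature_rows, labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, features):
    """Return the label whose centroid is closest in squared distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Invented toy logs: two labeled "real" and two labeled "fabricated".
real_1 = [(0, "next"), (40, "next"), (95, "edit_answer"),
          (130, "back"), (170, "next"), (240, "next")]
real_2 = [(0, "next"), (50, "edit_answer"), (110, "next"),
          (160, "next"), (230, "next")]
fake_1 = [(0, "next"), (5, "next"), (11, "next"), (16, "next"), (22, "next")]
fake_2 = [(0, "next"), (4, "next"), (9, "next"), (15, "next"), (19, "next")]

training_logs = [real_1, real_2, fake_1, fake_2]
training_labels = ["real", "real", "fabricated", "fabricated"]
centroids = train_centroids(
    [extract_features(log) for log in training_logs], training_labels
)

# Classify a new, unlabeled interview log.
new_log = [(0, "next"), (6, "next"), (12, "next"), (17, "next")]
print(predict(centroids, extract_features(new_log)))
```

The sketch leaves the features unscaled, so the duration feature dominates the distance. A realistic setup would normalize the features and measure accuracy on held-out labeled interviews, since the reported accuracy figures are a property of the authors' data and method and cannot be reproduced from this toy example.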