Systematic interviewer error is a potential issue in any health survey, and it can be especially pernicious in low- and middle-income countries, where survey teams may face limited supervision, chaotic environments, language barriers, and low literacy. Survey teams in such environments could benefit from software that leverages mobile data collection tools to provide automated data quality control. As a first step toward such software, we investigate and test several algorithms that find anomalous patterns in data. We validate the algorithms using one labeled data set and two unlabeled data sets from two community outreach programs in East Africa. In the labeled set, some of the data is known to be fabricated and some is believed to be relatively accurate. The unlabeled sets come from actual field operations. We demonstrate the feasibility of tools for automated data quality control by showing that the algorithms detect the fabricated data in the labeled set with high sensitivity and specificity, and that they surface compelling anomalies in the unlabeled sets.
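The abstract does not specify the algorithms used, but the evaluation it describes — flagging anomalous records and scoring the flags against known fabrication labels — can be illustrated with a minimal sketch. The z-score anomaly rule, the threshold, and all data below are assumptions for illustration, not the paper's method:

```python
# Hypothetical sketch: flag values (e.g., a per-interviewer summary statistic)
# that deviate strongly from the group, then score the flags against labels.
# All data is synthetic; the abstract does not name the actual algorithms.

def z_scores(values):
    """Population standard scores for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = var ** 0.5 or 1.0  # avoid division by zero when all values agree
    return [(v - mean) / std for v in values]

def flag_anomalies(values, threshold=1.5):
    """Mark values whose |z| exceeds the threshold as anomalous."""
    return [abs(z) > threshold for z in z_scores(values)]

def sensitivity_specificity(predicted, actual):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    return tp / (tp + fn), tn / (tn + fp)

# Synthetic example: four ordinary interviewers and one with an extreme value.
values = [10, 11, 9, 10, 30]
labels = [False, False, False, False, True]  # True = known fabricated
flags = flag_anomalies(values)
sens, spec = sensitivity_specificity(flags, labels)
```

In this toy case the single extreme value is flagged and both sensitivity and specificity are 1.0; real survey data would of course require richer features and more careful thresholds.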