Aggregating crowdsourced binary ratings

Authors:
Nilesh Dalvi;Anirban Dasgupta;Ravi Kumar;Vibhor Rastogi
Affiliations:
Facebook, Menlo Park, CA, USA;Yahoo! Labs, Sunnyvale, CA, USA;Google, Mountain View, CA, USA;Google, Mountain View, CA, USA
Venue:
Proceedings of the 22nd international conference on World Wide Web
Year:
2013


Abstract

In this paper we analyze a crowdsourcing system consisting of a set of users and a set of binary-choice questions. Each user has an unknown, fixed reliability that determines the user's error rate in answering questions. The problem is to determine the truth values of the questions based solely on the users' answers. Although this problem has been studied extensively, theoretical error bounds have been shown only for restricted settings: when the graph between users and questions is either random or complete. In this paper we consider a general setting of the problem in which the user-question graph can be arbitrary. We obtain bounds on the error rate of our algorithm and show that it is governed by the expansion of the graph. We demonstrate, using several synthetic and real datasets, that our algorithm outperforms the state of the art.
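
The problem setup in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm, which the abstract does not specify: it only simulates users with fixed reliabilities answering binary questions over an arbitrary user-question graph, then aggregates the answers with plain majority vote as a simple baseline. All names here (`simulate_answers`, `majority_vote`, `error_rate`), the encoding of truth values as +1/-1, and the example graph and reliabilities are assumptions made for illustration.

```python
import random


def simulate_answers(truth, reliabilities, edges, rng):
    """Simulate answers on an arbitrary user-question graph.

    truth[q] is +1 or -1; reliabilities[u] is the (assumed) probability
    that user u answers a question correctly; edges lists (user, question)
    pairs, i.e. which user answers which question.
    """
    answers = {}
    for user, question in edges:
        is_correct = rng.random() < reliabilities[user]
        answers[(user, question)] = truth[question] if is_correct else -truth[question]
    return answers


def majority_vote(answers, num_questions):
    """Baseline aggregation: sign of the summed answers per question.

    Ties and unanswered questions are resolved to +1 (an arbitrary choice).
    """
    totals = [0] * num_questions
    for (_user, question), answer in answers.items():
        totals[question] += answer
    return [1 if total >= 0 else -1 for total in totals]


def error_rate(estimate, truth):
    """Fraction of questions whose estimated truth value is wrong."""
    wrong = sum(1 for e, t in zip(estimate, truth) if e != t)
    return wrong / len(truth)


if __name__ == "__main__":
    rng = random.Random(0)
    truth = [rng.choice([-1, 1]) for _ in range(20)]
    # Hypothetical reliabilities: user 0 is the most reliable.
    reliabilities = [0.9, 0.8, 0.7, 0.6, 0.55]
    # A hypothetical graph that is neither complete nor random: user 0
    # answers everything, the others answer alternating questions.
    edges = [(u, q) for u in range(5) for q in range(20)
             if u == 0 or (u + q) % 2 == 0]
    answers = simulate_answers(truth, reliabilities, edges, rng)
    estimate = majority_vote(answers, len(truth))
    print("majority-vote error rate:", error_rate(estimate, truth))
```

Majority vote weights every user equally and ignores reliabilities, so it serves only as a reference point against which a reliability-aware method could be compared.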