Eliminating spammers and ranking annotators for crowdsourced labeling tasks

Authors:
Vikas C. Raykar;Shipeng Yu
Affiliations:
Siemens Healthcare, Malvern, PA;Siemens Healthcare, Malvern, PA
Venue:
The Journal of Machine Learning Research
Year:
2012

Citing 11

A view of the EM algorithm that justifies incremental, sparse, and other variants
Proceedings of the NATO Advanced Study Institute on Learning in graphical models

Sparse bayesian learning and the relevance vector machine
The Journal of Machine Learning Research

Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)

Get another label? improving data quality and data mining using multiple, noisy labelers
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining

Supervised learning from multiple experts: whom to trust when everyone lies a bit
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning

Efficiently learning the accuracy of labeling sources for selective sampling
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining

Cheap and fast---but is it good?: evaluating non-expert annotations for natural language tasks
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Quality management on Amazon Mechanical Turk
Proceedings of the ACM SIGKDD Workshop on Human Computation

Learning From Crowds
The Journal of Machine Learning Research

Using Crowdsourcing and Active Learning to Track Sentiment in Online Media
Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence

Clustering dictionary definitions using Amazon Mechanical Turk
CSLDAMT '10 Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk

Cited 6

Efficient crowdsourcing for multi-class labeling
Proceedings of the ACM SIGMETRICS/international conference on Measurement and modeling of computer systems

Cross-task crowdsourcing
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining

Evaluating the crowd with confidence
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining

Aggregating crowdsourced binary ratings
Proceedings of the 22nd international conference on World Wide Web

Reconciliation of categorical opinions from multiple sources
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management

Mixtures of biased sentiment analysers
Advances in Data Analysis and Classification

Abstract

With the advent of crowdsourcing services it has become quite cheap and reasonably effective to get a data set labeled by multiple annotators in a short amount of time. Various methods have been proposed to estimate the consensus labels by correcting for the bias of annotators with different kinds of expertise. Since we do not have control over the quality of the annotators, very often the annotations can be dominated by spammers, defined as annotators who assign labels randomly without actually looking at the instance. Spammers can make acquiring labels very expensive and can degrade the quality of the final consensus labels. In this paper we propose an empirical Bayesian algorithm called SpEM that iteratively eliminates the spammers and estimates the consensus labels based only on the good annotators. The algorithm is motivated by defining a spammer score that can be used to rank the annotators. Experiments on simulated and real data show that the proposed approach is better than (or as good as) the earlier approaches in terms of accuracy while using a significantly smaller number of annotators.
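
The abstract does not state the spammer score itself. As a minimal sketch of ranking annotators by such a score for binary labels, the snippet below assumes the form (sensitivity + specificity - 1)^2, which is zero for an annotator whose labels are independent of the true label. It also assumes gold labels are available for estimating each annotator's rates, which is a simplification for illustration only and is not the SpEM procedure. The function names and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple


def spammer_score(sensitivity: float, specificity: float) -> float:
    """Assumed binary spammer score: 0 means the labels carry no information."""
    return (sensitivity + specificity - 1.0) ** 2


def estimate_rates(labels: List[int], truth: List[int]) -> Tuple[float, float]:
    """Estimate sensitivity and specificity against gold labels (an assumption)."""
    on_pos = [l for l, t in zip(labels, truth) if t == 1]
    on_neg = [l for l, t in zip(labels, truth) if t == 0]
    sensitivity = sum(on_pos) / len(on_pos)              # P(label=1 | truth=1)
    specificity = sum(1 - l for l in on_neg) / len(on_neg)  # P(label=0 | truth=0)
    return sensitivity, specificity


def rank_annotators(
    annotations: Dict[str, List[int]], truth: List[int]
) -> List[Tuple[str, float]]:
    """Return (annotator, score) pairs sorted from most to least informative."""
    scored = [
        (name, spammer_score(*estimate_rates(labels, truth)))
        for name, labels in annotations.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: four positive instances followed by four negative ones.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
annotations = {
    "careful": [1, 1, 1, 1, 0, 0, 0, 1],
    "coin_flip": [1, 0, 1, 0, 1, 0, 1, 0],
    "always_positive": [1, 1, 1, 1, 1, 1, 1, 1],
    # Consistently flipped labels are still informative under this score.
    "adversarial": [0, 0, 0, 0, 1, 1, 1, 1],
}

ranking = rank_annotators(annotations, truth)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

Under this assumed score, both the coin-flipping annotator and the one who always answers positive score zero, so a threshold near zero would flag them for removal. An annotator who consistently inverts the labels scores high, because the inverted labels can still be corrected.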