Data quality from crowdsourcing: a study of annotation selection criteria

  • Authors:
  • Pei-Yun Hsueh; Prem Melville; Vikas Sindhwani

  • Affiliations:
  • IBM T.J. Watson Research Center, Yorktown Heights, NY; IBM T.J. Watson Research Center, Yorktown Heights, NY; IBM T.J. Watson Research Center, Yorktown Heights, NY

  • Venue:
  • HLT '09 Proceedings of the NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing
  • Year:
  • 2009

Abstract

Annotation acquisition is an essential step in training supervised classifiers. However, manual annotation is often time-consuming and expensive. The possibility of recruiting annotators through Internet services (e.g., Amazon Mechanical Turk) is an appealing option that allows multiple labeling tasks to be outsourced in bulk, typically with low overall costs and fast completion rates. In this paper, we consider the difficult problem of classifying sentiment in political blog snippets. Annotation data from both expert annotators in a research lab and non-expert annotators recruited from the Internet are examined. Three criteria are identified for selecting high-quality annotations: noise level, sentiment ambiguity, and lexical uncertainty. Analyses confirm the utility of these criteria in improving data quality. We conduct an empirical study to examine the effect of noisy annotations on the performance of sentiment classification models, and evaluate the utility of annotation selection on classification accuracy and efficiency.
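
The abstract names the three selection criteria but does not give their definitions. As a rough illustration only, the sketch below shows one plausible way such criteria might be computed per snippet: noise level as disagreement with a majority or gold label, sentiment ambiguity as entropy of the annotators' label distribution, and lexical uncertainty as the confidence gap of a lexicon- or model-based sentiment scorer. All function names, thresholds, and exact formulas here are assumptions for illustration, not the paper's actual method.

```python
import math
from collections import Counter

def noise_level(labels, gold=None):
    """Fraction of annotations that disagree with the majority label
    (or with a gold label, if one is available)."""
    reference = gold if gold is not None else Counter(labels).most_common(1)[0][0]
    return sum(1 for lab in labels if lab != reference) / len(labels)

def sentiment_ambiguity(labels):
    """Entropy of the label distribution for one snippet; higher values
    mean annotators split more evenly across sentiment classes."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def lexical_uncertainty(class_probs):
    """Uncertainty of a lexicon- or model-based sentiment scorer on the
    snippet's text, measured here as 1 minus the top-class probability."""
    return 1.0 - max(class_probs)

def keep_annotation(labels, class_probs,
                    max_noise=0.3, max_ambiguity=0.9, max_lexical=0.5):
    """Keep a snippet's annotations only if all three scores fall below
    illustrative thresholds (the paper's thresholds are not given here)."""
    return (noise_level(labels) <= max_noise
            and sentiment_ambiguity(labels) <= max_ambiguity
            and lexical_uncertainty(class_probs) <= max_lexical)

# Example: three annotators split 2-1, scorer favors "positive" at 0.7.
print(keep_annotation(["pos", "pos", "neg"], [0.7, 0.2, 0.1]))
```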