Crowd-sourcing approaches such as Amazon's Mechanical Turk (MTurk) make it possible to annotate or collect large amounts of linguistic data at relatively low cost and high speed. However, MTurk offers only limited control over who is allowed to participate in a particular task. This is particularly problematic for tasks requiring free-form text entry: unlike in multiple-choice tasks, there is no single correct answer, so control items for which the correct answer is known cannot be used. Furthermore, MTurk has no effective built-in mechanism to guarantee that workers are proficient English writers. We describe our experience in creating corpora of images annotated with multiple one-sentence descriptions on MTurk, and we explore the effectiveness of different quality control strategies for collecting linguistic data with MTurk. We find that a qualification test yields the largest improvement in quality, whereas refining the annotations through follow-up tasks works rather poorly. Using our best setup, we construct two image corpora totaling more than 40,000 descriptive captions for 9,000 images.