Because it is expensive and time-consuming, human knowledge acquisition has consistently been a major bottleneck in solving real problems. In this paper, we present a practical framework for acquiring high-quality non-expert knowledge from an on-demand workforce using Amazon Mechanical Turk (MTurk). We show how to apply this framework to collect large-scale human knowledge on AOL query classification quickly and efficiently. Based on extensive experiments and analysis, we demonstrate how to detect low-quality labels in massive data sets and assess their impact on collecting high-quality knowledge. Our experimental findings also provide insight into best practices for balancing cost and data quality when using MTurk.