Acquiring high quality non-expert knowledge from on-demand workforce

  • Authors:
  • Donghui Feng; Sveva Besana; Remi Zajac

  • Affiliations:
  • AT&T Interactive Research, Glendale, CA; AT&T Interactive Research, Glendale, CA; AT&T Interactive Research, Glendale, CA

  • Venue:
  • People's Web '09 Proceedings of the 2009 Workshop on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources
  • Year:
  • 2009

Abstract

Being expensive and time-consuming, human knowledge acquisition has consistently been a major bottleneck for solving real problems. In this paper, we present a practical framework for acquiring high-quality non-expert knowledge from an on-demand workforce using Amazon Mechanical Turk (MTurk). We show how to apply this framework to collect large-scale human knowledge on AOL query classification quickly and efficiently. Based on extensive experiments and analysis, we demonstrate how to detect low-quality labels in massive data sets and examine their impact on collecting high-quality knowledge. Our experimental findings also provide insight into best practices for balancing cost and data quality when using MTurk.