Implementing crowdsourcing-based relevance experimentation: an industrial perspective

  • Authors: Omar Alonso
  • Affiliations: Microsoft Corp., Mountain View, USA
  • Venue: Information Retrieval
  • Year: 2013

Abstract

Crowdsourcing has emerged as a viable platform for conducting different types of relevance evaluation. The main reason behind this trend is that it makes it possible to run experiments extremely fast, with good results at low cost. However, as in any experiment, several implementation details can make an experiment succeed or fail. To gather useful results, clear instructions, user interface guidelines, content quality, inter-rater agreement metrics, work quality, and worker feedback are all important characteristics of a successful crowdsourcing experiment. Furthermore, designing and implementing experiments that require thousands or millions of labels differs from conducting small-scale research investigations. In this paper we outline a framework for conducting continuous crowdsourcing experiments, emphasizing aspects that should be of importance for all sorts of tasks. We illustrate the value of characteristics that can impact the overall outcome using examples based on TREC, INEX, and Wikipedia data sets.
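
The abstract names inter-rater agreement as one of the quality signals for crowdsourced relevance labels. As a minimal illustrative sketch (not code from the paper), the following Python computes Cohen's kappa for two workers' binary relevance judgments; the worker label lists are hypothetical example data.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters judging the same items.

    labels_a, labels_b: equal-length sequences of categorical labels
    (e.g., 1 = relevant, 0 = not relevant).
    """
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both raters labeled identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical binary relevance judgments from two crowd workers.
worker_1 = [1, 1, 0, 1, 0, 0, 1, 1]
worker_2 = [1, 0, 0, 1, 0, 1, 1, 1]
print(cohens_kappa(worker_1, worker_2))  # ~0.47: moderate agreement
```

In a continuous experimentation setting of the kind the paper describes, a statistic like this would typically be tracked per task batch so that low-agreement batches can be flagged for clearer instructions or additional judgments.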