Leveraging non-expert crowdsourcing workers for improper task detection in crowdsourcing marketplaces

  • Authors:
  • Yukino Baba; Hisashi Kashima; Kei Kinoshita; Goushi Yamaguchi; Yosuke Akiyoshi

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2014

Abstract

Controlling the quality of tasks, i.e., the propriety of posted jobs, is a major challenge in crowdsourcing marketplaces. Most existing crowdsourcing services prohibit requesters from posting illegal or objectionable tasks. Marketplace operators must continuously monitor tasks to find such improper ones; however, manually investigating each task is very expensive. In this paper, we present the results of our trial study on automatic detection of improper tasks to support the monitoring activities of marketplace operators. We performed experiments using real task data from a commercial crowdsourcing marketplace and showed that a classifier trained on the operators' judgments achieves high performance in detecting improper tasks. By analyzing the estimated classifier, we identified several features that are effective for detecting improper tasks, such as the words appearing in the task information, the amount of money each worker will receive for the task, and the type of worker qualification option set for a task. In addition, to reduce the operators' annotation costs and improve classification performance, we considered the use of crowdsourcing for task annotation. We hired a group of (non-expert) crowdsourcing workers to monitor posted tasks and used their judgments to train the classifier. We confirmed that applying quality control techniques is beneficial for handling the variability in worker reliability and that doing so improved the performance of the classifier. Finally, our results showed that using the non-expert judgments of crowdsourcing workers in combination with expert judgments improves the performance of detecting improper crowdsourcing tasks, and that using crowdsourced labels allows the required number of expert judgments to be reduced by 25% while maintaining the same level of detection performance.
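The abstract describes two technical ingredients: aggregating noisy non-expert judgments with a quality control step, and training a classifier on task features such as the words in the task description. The sketch below is only an illustration of that general pipeline, assuming majority voting as the quality control rule, TF-IDF text features, and logistic regression via scikit-learn; the toy data, feature choices, and model are assumptions, not the paper's actual implementation.

```python
# Minimal sketch (assumed pipeline, not the paper's exact method):
# 1) aggregate noisy non-expert worker labels by majority vote,
# 2) train a text classifier on task descriptions.
from collections import Counter

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data: each posted task has a description and several non-expert
# judgments (1 = improper, 0 = proper). Entirely illustrative.
tasks = [
    {"description": "Write fake five-star reviews for our product", "worker_labels": [1, 1, 0]},
    {"description": "Transcribe the attached audio interview", "worker_labels": [0, 0, 0]},
    {"description": "Create accounts and upvote these posts", "worker_labels": [1, 0, 1]},
    {"description": "Label images of street signs", "worker_labels": [0, 0, 1]},
]

def majority_vote(labels):
    """Simplest quality-control step: keep the most common worker judgment."""
    return Counter(labels).most_common(1)[0][0]

descriptions = [t["description"] for t in tasks]
aggregated = [majority_vote(t["worker_labels"]) for t in tasks]

# Bag-of-words (TF-IDF) features over the task text stand in for the
# word-based features the study reports as effective.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(descriptions, aggregated)

print(model.predict(["Post positive reviews about this restaurant"]))
```

More elaborate aggregation models (e.g., weighting workers by estimated reliability) could replace the majority vote, and structured features such as the task reward or qualification options could be appended to the text features; the overall train-on-aggregated-labels structure stays the same.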