Opinion mining of Spanish customer comments with non-expert annotations on Mechanical Turk

  • Authors:
  • Bart Mellebeek;Francesc Benavent;Jens Grivolla;Joan Codina;Marta R. Costa-jussà;Rafael Banchs

  • Affiliations:
  • Barcelona Media Innovation Center, Barcelona, Spain (all authors)

  • Venue:
  • CSLDAMT '10 Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk
  • Year:
  • 2010

Abstract

One of the major bottlenecks in the development of data-driven AI systems is the cost of reliable human annotations. The recent advent of crowdsourcing platforms such as Amazon's Mechanical Turk, which give requesters affordable and rapid access to a global workforce, greatly facilitates the creation of massive training data. Most available studies on the effectiveness of crowdsourcing report on English data. We use Mechanical Turk annotations to train an opinion mining system that classifies Spanish consumer comments. We design three different Human Intelligence Task (HIT) strategies and report high inter-annotator agreement between non-expert and expert annotators. We evaluate the advantages and drawbacks of each HIT design and show that, in our case, non-expert annotations are a viable and cost-effective alternative to expert annotations.
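
The abstract does not say which agreement measure was used. As an illustration only, the sketch below shows one common way such a comparison could be made: several non-expert votes per comment are aggregated by majority vote, and the result is compared with expert labels using Cohen's kappa. The function names, the label set ("pos"/"neg"), and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among one item's non-expert votes.

    Assumption: an odd number of votes over two labels, so no ties occur.
    """
    return Counter(labels).most_common(1)[0][0]


def cohen_kappa(a, b):
    """Cohen's kappa between two equal-length label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(a)
    # Observed agreement: fraction of items on which both annotators agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: computed from each annotator's label distribution.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: three non-expert votes per comment, one expert label.
worker_votes = [
    ["pos", "pos", "neg"],
    ["neg", "neg", "neg"],
    ["pos", "neg", "pos"],
    ["neg", "pos", "neg"],
]
expert_labels = ["pos", "neg", "neg", "neg"]

aggregated = [majority_vote(votes) for votes in worker_votes]
kappa = cohen_kappa(aggregated, expert_labels)
print(aggregated, round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one class dominates the comments. Other measures, such as Fleiss' kappa across all individual workers, would be equally plausible choices here.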