Crowdsourcing for search and data mining (ACM SIGIR Forum)
Quality through flow and immersion: gamifying crowdsourced relevance assessments (SIGIR '12: Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval)
Identifying top news using crowdsourcing (Information Retrieval)
I will discuss the repeated acquisition of "labels" for data items when the labeling is imperfect. Labels are values provided by humans for specified variables on data items, such as "PG-13" for "Adult Content Rating on this Web Page." With the increasing popularity of micro-outsourcing systems, such as Amazon's Mechanical Turk, it is often possible to obtain less-than-expert labeling at low cost. We examine the improvement (or lack thereof) in data quality via repeated labeling, focusing especially on the improvement of training labels for supervised induction.
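The abstract does not name a specific aggregation scheme; the simplest form of repeated labeling integrates the multiple noisy labels by majority vote. The sketch below is a minimal illustration, not the talk's method: the per-labeler accuracy `p`, the independence of labelers, and the binary-label setting are all assumptions introduced here for clarity.

```python
from collections import Counter
import random

def majority_vote(labels):
    """Return the most frequent label; ties are broken arbitrarily."""
    return Counter(labels).most_common(1)[0][0]

def noisy_labels(true_label, accuracy, n, rng):
    """Draw n independent binary labels, each correct with probability `accuracy`."""
    return [true_label if rng.random() < accuracy else 1 - true_label
            for _ in range(n)]

if __name__ == "__main__":
    rng = random.Random(0)
    p = 0.7            # assumed accuracy of each imperfect labeler (hypothetical value)
    trials = 10_000
    for n in (1, 3, 5, 11):   # odd label counts avoid ties
        correct = sum(
            majority_vote(noisy_labels(0, p, n, rng)) == 0
            for _ in range(trials)
        )
        print(f"{n:2d} labels/item -> integrated-label accuracy: {correct / trials:.3f}")
```

When individual labelers are better than random (p > 0.5) and err independently, majority-vote accuracy rises with the number of labels per item (the Condorcet jury theorem intuition); whether that investment actually improves training-data quality for supervised induction is exactly the question the talk examines.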