Online word games for semantic data collection

  • Authors:
  • David Vickrey; Aaron Bronzan; William Choi; Aman Kumar; Jason Turner-Maier; Arthur Wang; Daphne Koller

  • Affiliations:
  • Stanford University, Stanford, CA (all authors)

  • Venue:
  • EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2008

Abstract

Obtaining labeled data is a significant obstacle for many NLP tasks. Recently, online games have been proposed as a new way of obtaining labeled data; games attract users by being fun to play. In this paper, we consider the application of this idea to collecting semantic relations between words, such as hypernym/hyponym relationships. We built three online games, inspired by the real-life games of Scattergories™ and Taboo™. As of June 2008, players have entered nearly 800,000 data instances, in two categories. The first type of data consists of category/answer pairs ("Types of vehicle", "car"), while the second is essentially free association data ("submarine", "underwater"). We analyze both types of data in detail and discuss potential uses of the data. We show that we can extract from our data set a significant number of new hypernym/hyponym pairs not already found in WordNet.