Online word games for semantic data collection

Authors:
David Vickrey;Aaron Bronzan;William Choi;Aman Kumar;Jason Turner-Maier;Arthur Wang;Daphne Koller
Affiliations:
Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA
Venue:
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Year:
2008

Abstract

Obtaining labeled data is a significant obstacle for many NLP tasks. Recently, online games have been proposed as a new way of obtaining labeled data; games attract users by being fun to play. In this paper, we apply this idea to collecting semantic relations between words, such as hypernym/hyponym relationships. We built three online games, inspired by the real-life games Scattergories™ and Taboo™. As of June 2008, players had entered nearly 800,000 data instances in two categories. The first consists of category/answer pairs (e.g., "Types of vehicle", "car"), while the second is essentially free-association data (e.g., "submarine", "underwater"). We analyze both types of data in detail and discuss potential uses. We show that a significant number of new hypernym/hyponym pairs, not already found in WordNet, can be extracted from our data set.
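As a rough illustration of the last step, the sketch below shows one way category/answer pairs could be turned into candidate hypernym/hyponym pairs and filtered against pairs that are already known. It is not the authors' procedure: the sample entries, the "Types of X" template stripping, the `min_count` agreement threshold, and the toy `known_pairs` set (a stand-in for a lexical resource such as WordNet) are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical sample of player-entered (category, answer) pairs.
entries = [
    ("Types of vehicle", "car"),
    ("Types of vehicle", "Car"),
    ("Types of vehicle", "hovercraft"),
    ("Types of vehicle", "hovercraft"),
    ("Types of vehicle", "banana"),
]

# Toy stand-in for (hypernym, hyponym) pairs already present in a
# lexical resource; a real check would query that resource instead.
known_pairs = {("vehicle", "car")}

PREFIX = "types of "  # assumed category template, for illustration only


def category_to_hypernym(category):
    """Strip the assumed 'Types of X' template to recover the hypernym X."""
    text = category.strip().lower()
    return text[len(PREFIX):] if text.startswith(PREFIX) else text


def new_hypernym_pairs(entries, known_pairs, min_count=2):
    """Return normalized pairs entered at least min_count times and not yet known."""
    counts = Counter(
        (category_to_hypernym(cat), ans.strip().lower()) for cat, ans in entries
    )
    return sorted(
        pair for pair, n in counts.items()
        if n >= min_count and pair not in known_pairs
    )


candidates = new_hypernym_pairs(entries, known_pairs)
print(candidates)  # [('vehicle', 'hovercraft')]
```

Requiring several players to agree on an answer is a simple way to suppress noise such as ("vehicle", "banana"); the threshold used here is arbitrary.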