How good is the crowd at "real" WSD?

  • Authors:
  • Jisup Hong; Collin F. Baker

  • Affiliations:
  • International Computer Science Institute, Berkeley, CA (both authors)

  • Venue:
  • LAW V '11 Proceedings of the 5th Linguistic Annotation Workshop
  • Year:
  • 2011

Abstract

There has been a great deal of excitement recently about using the "wisdom of the crowd" to collect data of all kinds, quickly and cheaply (Howe, 2008; von Ahn and Dabbish, 2008). Snow et al. (2008) were the first to give a convincing demonstration that at least some kinds of linguistic data can be gathered from workers on the web as accurately as, and more cheaply than, from local experts, and there has been a steady stream of papers and workshops since then reporting similar results, e.g. (Callison-Burch and Dredze, 2010). Many of the tasks that have been successfully crowdsourced involve judgments similar to those made in everyday life, such as recognizing unclear writing (von Ahn et al., 2008); for those tasks that do require considerable judgment, the responses are usually binary or drawn from a small set of options, as in sentiment analysis (Mellebeek et al., 2010) or ratings (Heilman and Smith, 2010). Since the FrameNet annotation process is known to be relatively expensive, we were interested in whether its fine word sense discrimination and marking of dependents with semantic roles could be performed more cheaply and equally accurately using Amazon's Mechanical Turk (AMT) or similar resources. We report on a partial success in this respect and how it was achieved.