SemEval-2010 task 3: cross-lingual word sense disambiguation

  • Authors:
  • Els Lefever; Veronique Hoste

  • Affiliations:
  • University College Ghent, Groot-Brittanniëlaan, Belgium and Ghent University, Krijgslaan, Belgium; University College Ghent, Groot-Brittanniëlaan, Belgium and Ghent University, Krijgslaan, Belgium

  • Venue:
  • DEW '09 Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions
  • Year:
  • 2009

Abstract

We propose a multilingual unsupervised Word Sense Disambiguation (WSD) task for a sample of English nouns. Instead of providing manually sense-tagged examples for each sense of a polysemous noun, we build our sense inventory from the Europarl parallel corpus. The multilingual setup involves the translations of a given English polysemous noun in five supported languages: Dutch, French, German, Spanish and Italian. The task has two goals: (a) the manual creation of a multilingual sense inventory for a lexical sample of English nouns and (b) the evaluation of systems on their ability to disambiguate new occurrences of the selected polysemous nouns. To create the hand-tagged gold standard, all translations of a given polysemous English noun are retrieved in the five languages and clustered by meaning. Systems can participate in five bilingual evaluation subtasks (English-Dutch, English-German, etc.) and in a multilingual subtask covering all language pairs. As WSD from cross-lingual evidence is gaining popularity, we believe it is important to create a multilingual gold standard and run cross-lingual WSD benchmark tests.
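
To make the setup concrete, the sketch below shows one possible way to represent a sense inventory whose senses are clusters of translations in the five target languages, and to score a system on one bilingual subtask. It is only an illustration under stated assumptions: the names `SenseCluster`, `INVENTORY` and `bilingual_accuracy` are hypothetical, the entries for "bank" are hand-picked dictionary translations rather than items from the task's gold standard, and the simple accuracy measure stands in for whatever scoring the task actually uses, which the abstract does not specify.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# ISO codes for the five target languages named in the abstract.
LANGUAGES = ("nl", "fr", "de", "es", "it")


@dataclass
class SenseCluster:
    """One sense of an English noun: translations grouped by meaning."""
    label: str
    translations: dict[str, set[str]] = field(default_factory=dict)


# Hypothetical inventory entry (illustrative, NOT taken from the gold standard).
INVENTORY = {
    "bank": [
        SenseCluster("financial institution", {
            "nl": {"bank"}, "fr": {"banque"}, "de": {"Bank"},
            "es": {"banco"}, "it": {"banca"},
        }),
        SenseCluster("river side", {
            "nl": {"oever"}, "fr": {"rive"}, "de": {"Ufer"},
            "es": {"orilla"}, "it": {"riva"},
        }),
    ]
}


def bilingual_accuracy(predictions: dict[str, str],
                       gold: dict[str, SenseCluster],
                       language: str) -> float:
    """Fraction of test instances whose predicted translation falls inside
    the gold cluster for the given target language (assumed metric)."""
    if not gold:
        return 0.0
    hits = sum(
        1 for instance_id, cluster in gold.items()
        if predictions.get(instance_id) in cluster.translations.get(language, set())
    )
    return hits / len(gold)


if __name__ == "__main__":
    # Two hypothetical test occurrences of "bank", one per sense.
    gold = {"bank.1": INVENTORY["bank"][0], "bank.2": INVENTORY["bank"][1]}
    system_fr = {"bank.1": "banque", "bank.2": "banque"}
    print(bilingual_accuracy(system_fr, gold, "fr"))
```

Under this representation, a multilingual subtask score could be obtained by repeating the same check for each code in `LANGUAGES`; how the individual results are combined is left open here.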