Knowledge-based WSD on specific domains: performing better than generic supervised WSD

  • Authors:
  • Eneko Agirre;Oier Lopez De Lacalle;Aitor Soroa

  • Affiliations:
  • Informatika Fakultatea, University of the Basque Country, Donostia, Basque Country;Informatika Fakultatea, University of the Basque Country, Donostia, Basque Country;Informatika Fakultatea, University of the Basque Country, Donostia, Basque Country

  • Venue:
  • IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
  • Year:
  • 2009

Abstract

This paper explores the application of knowledge-based Word Sense Disambiguation (WSD) systems to specific domains, based on our state-of-the-art graph-based WSD system that uses the information in WordNet. Evaluation was performed over a publicly available domain-specific dataset of 41 words related to Sports and Finance, comprising examples drawn from three corpora: one balanced corpus (BNC) and two domain-specific corpora (news related to Sports and Finance). The results show that in all three corpora our knowledge-based WSD algorithm improves over previous results, and also over two state-of-the-art supervised WSD systems trained on SemCor, the largest publicly available annotated corpus. We also show that using related words as context, instead of the actual occurrence contexts, yields better results on the domain datasets, but not on the general one. Interestingly, the results are higher for the domain-specific corpora than for the general corpus, raising prospects for improving current WSD systems when applied to specific domains.