Projecting parameters for multilingual word sense disambiguation

  • Authors:
  • Mitesh M. Khapra;Sapan Shah;Piyush Kedia;Pushpak Bhattacharyya

  • Affiliations:
  • Indian Institute of Technology, Mumbai, Maharashtra, India;Indian Institute of Technology, Mumbai, Maharashtra, India;Indian Institute of Technology, Mumbai, Maharashtra, India;Indian Institute of Technology, Mumbai, Maharashtra, India

  • Venue:
  • EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1
  • Year:
  • 2009

Abstract

We report a method for Word Sense Disambiguation (WSD) that has its origin in multilingual MT and that takes into account the fact that parallel corpora, wordnets, and sense-annotated corpora are scarce resources. Languages differ in their readiness with respect to these resources; however, a more resource-fortunate language can help a less resource-fortunate one. Our WSD method can be applied to a language even when no sense-tagged corpus for that language is available. This is achieved by projecting wordnet and corpus parameters from another language onto the language in question. The approach is centered on a novel synset-based multilingual dictionary and the empirical observation that, within a domain, the distribution of senses remains more or less invariant across languages. We verify the effectiveness of the approach by performing parameter projection and then running two different WSD algorithms. Accuracy values of approximately 75% (F1-score) for three languages in two different domains show that, within a domain, the scarcity of resources can be circumvented by projecting parameters such as sense distributions, corpus co-occurrences, and conceptual distance from one language to another.
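The idea of projecting a sense distribution through a synset-aligned dictionary can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `project_sense_distribution`, the data layout, the uniform fallback, and all identifiers and counts in the toy data are assumptions made up for illustration. The sketch assumes that each target-language word is linked, per synset, to one source-language word, and that sense-tagged counts exist only for the source language.

```python
def project_sense_distribution(target_word, target_senses, cross_links, source_counts):
    """Estimate P(synset | target_word) from source-language sense counts.

    target_senses: dict mapping a target word to its candidate synset IDs
    cross_links:   dict mapping (synset_id, target_word) to the linked source word
    source_counts: dict mapping (synset_id, source_word) to a sense-tagged count
    All three structures are hypothetical stand-ins for a synset-based
    multilingual dictionary and a source-language sense-annotated corpus.
    """
    raw = {}
    for synset in target_senses[target_word]:
        source_word = cross_links.get((synset, target_word))
        # Borrow the count of the linked source word tagged with this synset.
        raw[synset] = source_counts.get((synset, source_word), 0) if source_word else 0

    total = sum(raw.values())
    if total == 0:
        # Assumed fallback: no evidence in the source corpus, so stay uniform.
        return {synset: 1.0 / len(raw) for synset in raw}
    return {synset: count / total for synset, count in raw.items()}


# Toy data (entirely made up): one target word with two candidate synsets.
target_senses = {"tgt_word": ["syn_1", "syn_2"]}
cross_links = {
    ("syn_1", "tgt_word"): "src_a",
    ("syn_2", "tgt_word"): "src_b",
}
source_counts = {
    ("syn_1", "src_a"): 30,
    ("syn_2", "src_b"): 10,
}

dist = project_sense_distribution("tgt_word", target_senses, cross_links, source_counts)
print(dist)
```

With these toy counts the projected estimate favors `syn_1` over `syn_2` by three to one. The estimate is only as good as the assumption, stated in the abstract, that sense distributions within a domain stay roughly invariant across languages.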