Automatic word sense disambiguation using cooccurrence and hierarchical information

  • Authors:
  • David Fernandez-Amoros; Ruben Heradio Gil; Jose Antonio Cerrada Somolinos; Carlos Cerrada Somolinos

  • Affiliations:
  • ETSI Informatica, Universidad Nacional de Educacion a Distancia, Madrid, Spain (all authors)

  • Venue:
  • NLDB'10: Proceedings of the 15th International Conference on Applications of Natural Language to Information Systems (Natural Language Processing and Information Systems)
  • Year:
  • 2010

Abstract

We review in detail a polished version of the systems with which we participated in the SENSEVAL-2 English tasks (all-words and lexical-sample). The approach combines selectional preferences measured over a large corpus with hierarchical information taken from WordNet, along with some additional heuristics. We use that information to expand the glosses of the WordNet senses and then compare the context vectors with the word sense vectors, in a way similar to that used by Yarowsky and Schuetze. A supervised extension of the system is also discussed. We provide a new and previously unpublished evaluation over the SemCor collection, which is two orders of magnitude larger than the SENSEVAL-2 collections, as well as a comparison with baselines. Our systems scored first among unsupervised systems in both tasks. We note that the method is very sensitive to the quality of the characterizations of word senses, with glosses proving much better than training examples.
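
To make the vector comparison step concrete, the following is a minimal sketch of the general idea only, not the authors' system. It builds bag-of-words vectors for a context and for each candidate sense gloss, then picks the sense whose gloss has the highest cosine similarity with the context. The glosses, sense labels, stopword list, and function names are all hypothetical and hand-written for illustration. The expansion of glosses with corpus cooccurrence data and the WordNet hierarchy described in the abstract is not implemented here.

```python
import math
from collections import Counter

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in",
             "on", "for", "by", "at", "is", "he", "she", "it"}


def bag_of_words(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    tokens = (tok.strip(".,;:()").lower() for tok in text.split())
    return Counter(tok for tok in tokens if tok and tok not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[word] for word, count in u.items() if word in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def disambiguate(context, sense_glosses):
    """Return the best-scoring sense label and the score of every sense."""
    context_vec = bag_of_words(context)
    scores = {sense: cosine(context_vec, bag_of_words(gloss))
              for sense, gloss in sense_glosses.items()}
    return max(scores, key=scores.get), scores


# Toy glosses written for this example; they are NOT taken from WordNet.
TOY_GLOSSES = {
    "bank/finance": "a financial institution that accepts deposits and lends money",
    "bank/river": "sloping land beside a body of water such as a river",
}

best, scores = disambiguate("she deposits money at the bank", TOY_GLOSSES)
print(best, scores)
```

In this toy setting the result depends entirely on which words the gloss happens to contain, which is one way to picture the sensitivity to sense characterizations noted in the abstract: a richer (expanded) gloss gives the context more words to match against.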