Two web-based approaches for noun sense disambiguation

  • Authors:
  • Paolo Rosso;Manuel Montes-y-Gómez;Davide Buscaldi;Aarón Pancardo-Rodríguez;Luis Villaseñor Pineda

  • Affiliations:
  • Dpto. de Sistemas Informáticos y Computación (DSIC), Universidad Politécnica de Valencia, Spain;Dpto. de Sistemas Informáticos y Computación (DSIC), Universidad Politécnica de Valencia, Spain;Dipartimento di Informatica e Scienze dell'Informazione (DISI), Università di Genova, Italy;Lab. de Tecnologías del Lenguaje, Instituto Nacional de Astrofísica, Optica y Electrónica, Mexico;Lab. de Tecnologías del Lenguaje, Instituto Nacional de Astrofísica, Optica y Electrónica, Mexico

  • Venue:
  • CICLing'05 Proceedings of the 6th international conference on Computational Linguistics and Intelligent Text Processing
  • Year:
  • 2005

Abstract

Progress on the resolution of lexical ambiguity seems to be stuck because of the knowledge acquisition bottleneck. It is therefore worthwhile to investigate the possibility of using the Web as a lexical resource. This paper explores two attempts at using Web counts collected through a search engine. The first approach computes the hit counts of each possible synonym of the noun to be disambiguated together with the nouns of its context. In the second approach, the disambiguation of a noun uses a modifying adjective as supporting evidence. Using adjective-noun pairs yielded better precision than the baseline, albeit with low recall. A comprehensive set of weighting formulae for combining Web counts was investigated in order to give a complete picture of the various possibilities and of which formulae work best. The comparison across different search engines was also useful: Web counts, and consequently disambiguation results, were almost identical. Moreover, the Web seems to be more effective than the WordNet Domains lexical resource if integrated rather than used stand-alone.
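
As a rough illustration of the first approach, the sketch below scores each candidate sense by how often its synonyms co-occur with the context nouns, relative to how often each synonym occurs alone. The abstract does not specify the weighting formulae that were investigated, so the normalized ratio used here is only one plausible choice and should not be read as the authors' formula. The names `score_senses`, `best_sense`, `toy_hits` and the sense labels are hypothetical, and the counts are invented for the example rather than collected from any search engine.

```python
from typing import Callable, Dict, List


def score_senses(
    senses: Dict[str, List[str]],
    context_nouns: List[str],
    hits: Callable[[str], int],
) -> Dict[str, float]:
    """Score each sense from the hit counts of its synonyms.

    For every synonym, the count of the query "synonym + context nouns"
    is divided by the count of the synonym alone; the ratios are then
    averaged over the synonyms of the sense. This weighting is an
    assumption made for illustration only.
    """
    scores: Dict[str, float] = {}
    for sense, synonyms in senses.items():
        total = 0.0
        for synonym in synonyms:
            alone = hits(synonym)
            if alone == 0:
                continue  # an unseen synonym contributes nothing
            joint = hits(" ".join([synonym] + context_nouns))
            total += joint / alone
        scores[sense] = total / len(synonyms) if synonyms else 0.0
    return scores


def best_sense(scores: Dict[str, float]) -> str:
    """Return the sense label with the highest score."""
    return max(scores, key=scores.get)


# Invented counts standing in for a search engine (hypothetical data).
TOY_COUNTS = {
    "shore": 1000,
    "shore river water": 200,
    "depository": 500,
    "depository river water": 5,
}


def toy_hits(query: str) -> int:
    """Stub hit counter; a real system would query a search engine here."""
    return TOY_COUNTS.get(query, 0)


# Two hypothetical senses of "bank", each with one synonym.
senses = {"bank#river": ["shore"], "bank#finance": ["depository"]}
scores = score_senses(senses, ["river", "water"], toy_hits)
chosen = best_sense(scores)
```

Passing the hit counter in as a function keeps the scoring logic independent of any particular search engine, which fits the abstract's observation that different engines gave almost identical counts. The second approach could reuse the same structure by replacing the context nouns with the modifying adjective.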