Word sense disambiguation using a second language monolingual corpus

  • Authors:
  • Ido Dagan;Alon Itai

  • Affiliations:
  • AT&T Bell Laboratories;Technion---Israel Institute of Technology

  • Venue:
  • Computational Linguistics
  • Year:
  • 1994

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents a new approach for resolving lexical ambiguities in one language using statistical data from a monolingual corpus of another language. This approach exploits the differences between mappings of words to senses in different languages. The paper concentrates on the problem of target word selection in machine translation, for which the approach is directly applicable. The presented algorithm identifies syntactic relations between words, using a source language parser, and maps the alternative interpretations of these relations to the target language, using a bilingual lexicon. The preferred senses are then selected according to statistics on lexical relations in the target language. The selection is based on a statistical model and on a constraint propagation algorithm, which simultaneously handles all ambiguities in the sentence. The method was evaluated using three sets of Hebrew and German examples and was found to be very useful for disambiguation. The paper includes a detailed comparative analysis of statistical sense disambiguation methods.