Unsupervised sense disambiguation using bilingual probabilistic models

  • Authors:
  • Indrajit Bhattacharya;Lise Getoor;Yoshua Bengio

  • Affiliations:
  • University of Maryland, College Park, MD;University of Maryland, College Park, MD;Université de Montréal, Montréal, Québec, Canada

  • Venue:
  • ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

We describe two probabilistic models for unsupervised word-sense disambiguation using parallel corpora. The first model, which we call the Sense model, builds on the work of Diab and Resnik (2002) that uses both parallel text and a sense inventory for the target language, and recasts their approach in a probabilistic framework. The second model, which we call the Concept model, is a hierarchical model that uses a concept latent variable to relate different language specific sense labels. We show that both models improve performance on the word sense disambiguation task over previous unsupervised approaches, with the Concept model showing the largest improvement. Furthermore, in learning the Concept model, as a by-product, we learn a sense inventory for the parallel language.