Maximum entropy models for word sense disambiguation

  • Authors:
  • Gerald Chao; Michael G. Dyer

  • Affiliations:
  • University of California, Los Angeles, Los Angeles, California (both authors)

  • Venue:
  • COLING '02: Proceedings of the 19th International Conference on Computational Linguistics - Volume 1
  • Year:
  • 2002

Abstract

A maximum entropy-based word sense disambiguation system is presented, consisting of individual word experts trained on both labeled and partially labeled corpora. The classification probabilities from the individual word experts are integrated by a new search algorithm that balances time complexity against accuracy. The model is evaluated using established procedures on the English all-words task from the SENSEVAL-2 workshop, a large test set consisting of words from all word classes to be disambiguated. Lastly, an ongoing project that integrates POS tagging, parsing, and sense disambiguation in a single system is presented. Once in place, the system will be bootstrapped with existing partially labeled corpora, processing them and then training from the results. The goal is to show that with each successive iteration the accuracy of all three processes, POS tagging, parsing, and WSD, improves as the system learns from more accurate, self-generated training data.
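To make the "word expert" idea concrete, the sketch below trains a maximum entropy (multinomial logistic regression) classifier for a single ambiguous word and has it emit a probability distribution over senses, the kind of per-word output the paper's search algorithm would then combine across a sentence. This is a minimal illustration under assumed details, not the authors' implementation: the feature set, sense labels, and training sentences are hypothetical placeholders, and scikit-learn is used purely as a convenient maxent trainer.

```python
# Minimal sketch of a maximum-entropy "word expert" for one ambiguous word.
# NOT the paper's implementation: features, sense labels, and data are
# illustrative. Multinomial logistic regression is the standard parametric
# form of a maximum entropy classifier over discrete context features.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression


def context_features(tokens, i, window=2):
    """Positional and bag-of-words features from a window around token i."""
    feats = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if offset == 0 or not (0 <= j < len(tokens)):
            continue
        feats[f"w[{offset}]={tokens[j].lower()}"] = 1.0  # word at relative position
        feats[f"bow={tokens[j].lower()}"] = 1.0          # unordered co-occurrence
    return feats


# Hypothetical labeled instances for the target word "bank":
# (sentence tokens, index of the target word, sense label)
train = [
    ("he deposited cash at the bank yesterday".split(), 5, "bank/financial"),
    ("the bank approved the loan quickly".split(), 1, "bank/financial"),
    ("they fished from the river bank".split(), 4, "bank/river"),
    ("the bank of the stream was muddy".split(), 1, "bank/river"),
]

vec = DictVectorizer()
X = vec.fit_transform(context_features(toks, i) for toks, i, _ in train)
y = [sense for _, _, sense in train]

# The maximum entropy model for this word expert.
expert = LogisticRegression(max_iter=1000)
expert.fit(X, y)

# The expert outputs a distribution over senses; in the paper these per-word
# distributions feed the sentence-level search that resolves all words jointly.
test_tokens = "she opened an account at the bank".split()
X_test = vec.transform([context_features(test_tokens, 6)])
for sense, p in zip(expert.classes_, expert.predict_proba(X_test)[0]):
    print(f"{sense}: {p:.2f}")
```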