Augmented mixture models for lexical disambiguation

  • Authors:
  • Silviu Cucerzan; David Yarowsky

  • Affiliations:
  • Johns Hopkins University, Baltimore, MD; Johns Hopkins University, Baltimore, MD

  • Venue:
  • EMNLP '02: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing - Volume 10
  • Year:
  • 2002

Abstract

This paper investigates several augmented mixture models that are competitive alternatives to standard Bayesian models and that prove well suited to word sense disambiguation and related classification tasks. We present a new classification-correction technique that successfully addresses the underestimation of classes that are infrequent in the training data. We show that the mixture models are boosting-friendly and that both AdaBoost and our correction technique can significantly improve the results of the raw model, achieving state-of-the-art performance on several standard test sets in four languages. Because their output differs substantially from that of Naïve Bayes and other statistical methods, the investigated models are also shown to be effective participants in classifier combination.
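To make the general idea concrete, the following is a minimal sketch of an interpolated mixture-model sense classifier: each context word's probability is a mixture of a sense-specific distribution and a global distribution, in contrast to the purely sense-conditional estimates of Naïve Bayes. All names and parameters (e.g. the mixture weight `lam`, add-one smoothing) are illustrative assumptions, not the paper's exact formulation or correction technique.

```python
import math
from collections import Counter

def train(examples, lam=0.7):
    """Fit sense priors, per-sense word counts, and a global word distribution.

    examples: list of (sense_label, context_words) pairs.
    lam: illustrative mixture weight between the sense-specific and
         global word distributions (an assumption, not the paper's value).
    """
    sense_counts = Counter()
    word_given_sense = {}
    global_words = Counter()
    for sense, words in examples:
        sense_counts[sense] += 1
        wc = word_given_sense.setdefault(sense, Counter())
        for w in words:
            wc[w] += 1
            global_words[w] += 1
    return {
        "priors": sense_counts,
        "per_sense": word_given_sense,
        "global": global_words,
        "global_total": sum(global_words.values()),
        "vocab": len(global_words),
        "lam": lam,
    }

def classify(model, context):
    """Score each sense by log prior + sum of log mixture word probabilities."""
    total = sum(model["priors"].values())
    best, best_score = None, float("-inf")
    for sense, count in model["priors"].items():
        score = math.log(count / total)
        denom = sum(model["per_sense"][sense].values())
        for w in context:
            # Add-one smoothed estimates for both mixture components.
            p_sense = (model["per_sense"][sense][w] + 1) / (denom + model["vocab"])
            p_global = (model["global"][w] + 1) / (model["global_total"] + model["vocab"])
            score += math.log(model["lam"] * p_sense + (1 - model["lam"]) * p_global)
        if score > best_score:
            best, best_score = sense, score
    return best
```

A toy usage: trained on a few labeled contexts for two senses of "bank", the model picks the sense whose smoothed mixture assigns the test context higher probability.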