Discriminative instance weighting for domain adaptation in statistical machine translation

  • Authors:
  • George Foster; Cyril Goutte; Roland Kuhn

  • Affiliations:
  • National Research Council Canada, Gatineau, QC; National Research Council Canada, Gatineau, QC; National Research Council Canada, Gatineau, QC

  • Venue:
  • EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2010

Abstract

We describe a new approach to SMT adaptation that weights out-of-domain phrase pairs according to their relevance to the target domain, determined by both how similar to it they appear to be, and whether they belong to general language or not. This extends previous work on discriminative weighting by using a finer granularity, focusing on the properties of instances rather than corpus components, and using a simpler training procedure. We incorporate instance weighting into a mixture-model framework, and find that it yields consistent improvements over a wide range of baselines.
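The general idea of weighting out-of-domain phrase pairs inside a count-based estimate can be sketched as follows. This is only an illustration under stated assumptions, not the paper's actual model: the logistic form of the weight, the feature vectors, and the names `instance_weight` and `weighted_cond_prob` are all hypothetical, and the abstract does not specify how the weights are trained or how the mixture is formed.

```python
import math


def instance_weight(features, params):
    """Map a phrase pair's feature vector to a weight in (0, 1).

    Assumption: a logistic function of a linear score. The features
    (e.g. similarity to the target domain, general-language indicators)
    are hypothetical placeholders.
    """
    z = sum(p * f for p, f in zip(params, features))
    return 1.0 / (1.0 + math.exp(-z))


def weighted_cond_prob(in_counts, out_counts, weights, target):
    """Estimate p(source | target) from in-domain counts plus
    out-of-domain counts scaled by per-instance weights.

    in_counts, out_counts: dicts mapping (source, target) -> count
    weights: dict mapping (source, target) -> weight in [0, 1];
             pairs without a weight contribute nothing.
    """
    joint = {}
    for (s, t), c in in_counts.items():
        if t == target:
            joint[s] = joint.get(s, 0.0) + c
    for (s, t), c in out_counts.items():
        if t == target:
            joint[s] = joint.get(s, 0.0) + weights.get((s, t), 0.0) * c
    total = sum(joint.values())
    if total == 0:
        return {}
    return {s: c / total for s, c in joint.items()}


# Toy counts, invented purely for illustration.
in_counts = {("chat", "cat"): 2}
out_counts = {("chat", "cat"): 2, ("minou", "cat"): 4}
weights = {("chat", "cat"): 1.0, ("minou", "cat"): 0.5}
probs = weighted_cond_prob(in_counts, out_counts, weights, "cat")
```

In this toy setup the out-of-domain pair with the lower weight contributes only half of its raw count, so its estimated probability shrinks relative to an unweighted pooling of the two corpora.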