Domain adaptation techniques for machine translation and their evaluation in a real-world setting

  • Authors:
  • Baskaran Sankaran; Majid Razmara; Atefeh Farzindar; Wael Khreich; Fred Popowich; Anoop Sarkar

  • Affiliations:
  • School of Computing Science, Simon Fraser University, Burnaby, BC, Canada; School of Computing Science, Simon Fraser University, Burnaby, BC, Canada; NLP Technologies Inc., Montreal, QC, Canada; NLP Technologies Inc., Montreal, QC, Canada; School of Computing Science, Simon Fraser University, Burnaby, BC, Canada; School of Computing Science, Simon Fraser University, Burnaby, BC, Canada

  • Venue:
  • Canadian AI'12: Proceedings of the 25th Canadian Conference on Advances in Artificial Intelligence
  • Year:
  • 2012

Abstract

Statistical Machine Translation (SMT) is currently used in real-time and commercial settings to quickly produce initial translations of a document, which a human can later edit. SMT models specialized for one domain often perform poorly when applied to other domains, because the typical assumption that training and test data are drawn from the same distribution no longer holds. This paper evaluates domain adaptation techniques for SMT systems in the context of end-user feedback in a real-world application. We present experiments with two adaptation techniques, one relying on log-linear models and the other on mixture models. We describe our experimental results on legal and government data, and present a human evaluation of post-editing effort in addition to traditional automated scoring (BLEU scores). Human effort is measured primarily by the amount of time and the number of edits a professional post-editor needs to bring machine-generated translations up to industry standards. The experimental results show that the domain adaptation techniques can yield a significant increase in BLEU score (up to four points) and a significant reduction in post-editing time of about one second per word.
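
As a rough illustration of the two families of techniques named in the abstract, the sketch below contrasts a linear mixture with a log-linear combination of an in-domain and an out-of-domain translation probability for a single phrase pair. It is only a generic sketch: the function names, the probabilities, and the weights are hypothetical, and the abstract does not specify the formulation the paper actually uses.

```python
import math


def linear_mixture(p_in: float, p_out: float, lam: float) -> float:
    """Mixture model: weighted sum of the two component probabilities.

    lam is a hypothetical interpolation weight in [0, 1]; the result is
    itself a valid probability.
    """
    return lam * p_in + (1.0 - lam) * p_out


def log_linear(p_in: float, p_out: float, w_in: float, w_out: float) -> float:
    """Log-linear model: weighted sum of log-probabilities.

    The result is an unnormalized score (higher is better), not a
    probability. w_in and w_out are hypothetical feature weights that
    would normally be tuned on held-out data.
    """
    return w_in * math.log(p_in) + w_out * math.log(p_out)


# Hypothetical probabilities for one phrase pair under each model.
p_in_domain, p_out_domain = 0.6, 0.2

mixed = linear_mixture(p_in_domain, p_out_domain, lam=0.5)
score = log_linear(p_in_domain, p_out_domain, w_in=1.0, w_out=0.5)
print(f"linear mixture probability: {mixed:.3f}")
print(f"log-linear score: {score:.3f}")
```

In the mixture case the components are averaged, so a phrase pair unseen in one domain can still receive probability mass from the other. In the log-linear case each model contributes a separate weighted feature, so a very low probability under either model pulls the overall score down sharply.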