Wider context by using bilingual language models in machine translation

  • Authors:
  • Jan Niehues; Teresa Herrmann; Stephan Vogel; Alex Waibel

  • Affiliations:
  • Institute for Anthropomatics, KIT - Karlsruhe Institute of Technology, Germany; Institute for Anthropomatics, KIT - Karlsruhe Institute of Technology, Germany; Carnegie Mellon University; Institute for Anthropomatics, KIT - Karlsruhe Institute of Technology, Germany, and Carnegie Mellon University

  • Venue:
  • WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
  • Year:
  • 2011

Abstract

In past evaluations for machine translation of European languages, it was shown that the translation performance of SMT systems can be increased by integrating a bilingual language model into a phrase-based SMT system. In the bilingual language model, target words together with their aligned source words form the tokens of an n-gram language model. We analyze the effect of bilingual language models and show where they help to better model the translation process. We show improvements in translation quality on German-to-English and Arabic-to-English. In addition, for the Arabic-to-English task, training an additional bilingual language model on POS tags instead of surface word forms led to further improvements.
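
The following is a minimal illustrative sketch (not the authors' code) of how bilingual tokens of the kind described in the abstract can be built from a word-aligned sentence pair: each target word is joined with the source words aligned to it, and the resulting token sequence, in target order, can then be fed to a standard n-gram language model toolkit. The token format, the `NULL` marker for unaligned words, and the function names are assumptions for illustration only.

```python
# Sketch: constructing bilingual tokens from a word-aligned sentence pair.
# Assumptions (not from the paper): tokens are "target_source1_source2",
# unaligned target words receive the marker "NULL".

from collections import defaultdict

def bilingual_tokens(source, target, alignment):
    """source/target: lists of words; alignment: iterable of (src_idx, tgt_idx) links."""
    links = defaultdict(list)
    for s, t in alignment:
        links[t].append(source[s])
    tokens = []
    for t, word in enumerate(target):
        aligned = "_".join(sorted(links[t])) if links[t] else "NULL"
        tokens.append(f"{word}_{aligned}")
    return tokens

# Toy German-to-English example
src = ["das", "haus", "ist", "klein"]
tgt = ["the", "house", "is", "small"]
align = {(0, 0), (1, 1), (2, 2), (3, 3)}
print(bilingual_tokens(src, tgt, align))
# -> ['the_das', 'house_haus', 'is_ist', 'small_klein']
```

The same construction applies when surface forms are replaced by POS tags on both sides, which corresponds to the extra POS-based bilingual language model mentioned for the Arabic-to-English task.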