The University of Maryland statistical machine translation system for the Fourth Workshop on Machine Translation

  • Authors:
  • Chris Dyer;Hendra Setiawan;Yuval Marton;Philip Resnik

  • Affiliations:
  • University of Maryland, College Park, MD;University of Maryland, College Park, MD;University of Maryland, College Park, MD;University of Maryland, College Park, MD

  • Venue:
  • StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper describes the techniques we explored to improve the translation of news text in the German-English and Hungarian-English tracks of the WMT09 shared translation task. Beginning with a convention hierarchical phrase-based system, we found benefits for using word segmentation lattices as input, explicit generation of beginning and end of sentence markers, minimum Bayes risk decoding, and incorporation of a feature scoring the alignment of function words in the hypothesized translation. We also explored the use of monolingual paraphrases to improve coverage, as well as co-training to improve the quality of the segmentation lattices used, but these did not lead to improvements.