NUS at WMT09: domain adaptation experiments for English-Spanish machine translation of news commentary text

  • Authors: Preslav Nakov; Hwee Tou Ng
  • Affiliations: National University of Singapore, Singapore; National University of Singapore, Singapore
  • Venue: StatMT '09: Proceedings of the Fourth Workshop on Statistical Machine Translation
  • Year: 2009

Abstract

We describe the system developed by the National University of Singapore team for English-to-Spanish machine translation of News Commentary text at the WMT09 Shared Translation Task. Our approach is based on domain adaptation: we combined a small in-domain News Commentary bi-text with a large out-of-domain bi-text from the Europarl corpus, building a separate phrase table from each and then combining the two tables. We further combined two language models (in-domain and out-of-domain), and we experimented with cognates, improved tokenization and recasing. Among the English-to-Spanish systems at the shared task that were trained without additional external data, ours achieved the highest lowercased NIST score (6.963) and the second-best lowercased BLEU score (24.91%).
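
The abstract does not say how the two phrase tables or the two language models were combined, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes that in-domain phrase pairs take priority over out-of-domain ones, that each merged entry carries an extra origin flag, and that the language models are mixed by linear interpolation. The names `merge_phrase_tables`, `interpolate_lm` and the weight `lam` are hypothetical, and the probabilities are made-up toy values.

```python
from typing import Dict, Tuple

# Hypothetical toy representation: (source phrase, target phrase) -> probability
PhraseTable = Dict[Tuple[str, str], float]


def merge_phrase_tables(
    in_domain: PhraseTable, out_domain: PhraseTable
) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Merge two phrase tables, letting in-domain entries override.

    Each merged entry is (probability, origin_flag), where origin_flag is
    1.0 for in-domain entries and 0.0 for out-of-domain ones. This priority
    scheme is an assumption, not a detail taken from the abstract.
    """
    merged: Dict[Tuple[str, str], Tuple[float, float]] = {}
    # Start with the large out-of-domain table ...
    for pair, prob in out_domain.items():
        merged[pair] = (prob, 0.0)
    # ... then overwrite with in-domain entries wherever both tables overlap.
    for pair, prob in in_domain.items():
        merged[pair] = (prob, 1.0)
    return merged


def interpolate_lm(p_in: float, p_out: float, lam: float = 0.5) -> float:
    """Linearly interpolate in-domain and out-of-domain LM probabilities.

    lam is the (assumed) weight on the in-domain model, with 0 <= lam <= 1.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_in + (1.0 - lam) * p_out


# Toy data standing in for News Commentary (in-domain) and Europarl (out-of-domain).
news = {("house", "casa"): 0.8}
europarl = {("house", "casa"): 0.6, ("parliament", "parlamento"): 0.9}

merged = merge_phrase_tables(news, europarl)
mixed = interpolate_lm(p_in=0.02, p_out=0.01, lam=0.7)
```

In this sketch the origin flag would let a tuning step learn how much to trust each source, and `lam` would normally be chosen on held-out in-domain text rather than fixed by hand.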