UCB system description for the WMT 2007 shared task

  • Authors: Preslav Nakov; Marti Hearst

  • Affiliations: University of California at Berkeley, Berkeley, CA (both authors)

  • Venue: StatMT '07: Proceedings of the Second Workshop on Statistical Machine Translation
  • Year: 2007

Abstract

For the WMT 2007 shared task, the UC Berkeley team employed three techniques of interest. First, we used monolingual syntactic paraphrases to provide syntactic variety to the source training set sentences. Second, we trained two language models: a small in-domain model and a large out-of-domain model. Finally, we made use of results from prior research that shows that cognate pairs can improve word alignments. We contributed runs translating English to Spanish, French, and German using various combinations of these techniques.
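The abstract does not say how cognate pairs were identified or how they were fed into word alignment. As a rough illustration only, the sketch below scores candidate cognates with the longest common subsequence ratio (LCSR), a common orthographic similarity measure. The function names, the 0.6 threshold, and the choice of LCSR itself are assumptions made for this example, not details taken from the paper.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcsr(a: str, b: str) -> float:
    """Longest common subsequence ratio: LCS length over the longer word's length."""
    if not a or not b:
        return 0.0
    return lcs_length(a.lower(), b.lower()) / max(len(a), len(b))


def cognate_candidates(src_words, tgt_words, threshold=0.6):
    """Return (source, target, score) triples whose LCSR meets the threshold.

    The threshold is a hypothetical value chosen for illustration.
    """
    pairs = []
    for s in src_words:
        for t in tgt_words:
            score = lcsr(s, t)
            if score >= threshold:
                pairs.append((s, t, score))
    return pairs


if __name__ == "__main__":
    for s, t, score in cognate_candidates(["nation", "house"], ["nacion", "casa"]):
        print(f"{s}\t{t}\t{score:.2f}")
```

One plausible use of such pairs, again an assumption here, is to append them to the parallel training data as extra one-word "sentence" pairs so that the aligner sees additional evidence linking the two words.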