Direct error rate minimization for statistical machine translation

  • Authors:
  • Tagyoung Chung; Michel Galley

  • Affiliations:
  • University of Rochester, Rochester, NY; Microsoft Research, Redmond, WA

  • Venue:
  • WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation
  • Year:
  • 2012

Abstract

Minimum error rate training (MERT) is often the preferred method for optimizing the parameters of statistical machine translation systems. MERT minimizes error rate using a surrogate representation of the search space, such as N-best lists or hypergraphs, which offer only an incomplete view of that space. In our work, we instead minimize error rate directly by integrating the decoder into the minimizer. This approach yields two benefits. First, the function being optimized is the true error rate. Second, it lets us optimize parameters of translation systems other than standard linear model features, such as the distortion limit. Since integrating the decoder into the minimizer is often too slow to be practical, we also exploit statistical significance tests to accelerate the search by quickly discarding unpromising models. Experiments with a phrase-based system show that our approach is scalable, and that optimizing the parameters that MERT cannot handle improves translation results.
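
The sketch below illustrates the control flow the abstract describes, under stated assumptions: a decoder is called inside the optimizer's inner loop, candidate weight vectors are scored on the true per-sentence error, and a paired significance test is used to discard candidates early once they are significantly worse than the current best. Here `decode`, `sentence_errors`, the random-walk search move, and the sign test are hypothetical stand-ins for illustration only; the paper's actual decoder, error metric, search strategy, and significance test may differ.

```python
import math
import random

# Stand-ins (assumptions, not the paper's code): decode() would run the full
# phrase-based decoder under the given feature weights, and sentence_errors()
# would score each hypothesis against its reference (e.g., 1 - sentence BLEU).
def decode(weights, source_sentences):
    return ["hypothesis for: %s" % s for s in source_sentences]

def sentence_errors(hypotheses, references):
    return [random.random() for _ in hypotheses]  # mock per-sentence errors

def sign_test_pvalue(deltas):
    """Two-sided sign test on per-sentence error differences (candidate - best)."""
    wins = sum(1 for d in deltas if d < 0)
    losses = sum(1 for d in deltas if d > 0)
    n = wins + losses
    if n == 0:
        return 1.0
    k = min(wins, losses)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2.0 ** n
    return min(1.0, 2.0 * tail)  # binomial tail probability under p = 0.5

def direct_error_minimization(weights, src, ref, n_iters=50, chunk=100, alpha=0.05):
    """Search weight space by re-decoding the tuning set, discarding candidates
    early when a significance test shows they are already significantly worse."""
    best_w = list(weights)
    best_err = sentence_errors(decode(best_w, src), ref)
    for _ in range(n_iters):
        # A simple search move: perturb one weight (hypothetical; the paper's
        # actual search strategy may differ from this random-walk step).
        cand_w = list(best_w)
        cand_w[random.randrange(len(cand_w))] += random.gauss(0.0, 0.1)

        cand_err, rejected = [], False
        for start in range(0, len(src), chunk):
            batch_hyps = decode(cand_w, src[start:start + chunk])
            cand_err += sentence_errors(batch_hyps, ref[start:start + chunk])
            deltas = [c - b for c, b in zip(cand_err, best_err)]
            # Early rejection: significantly worse on the prefix decoded so far.
            if sum(deltas) > 0 and sign_test_pvalue(deltas) < alpha:
                rejected = True
                break
        if not rejected and sum(cand_err) < sum(best_err):
            best_w, best_err = cand_w, cand_err
    return best_w

# Toy usage with placeholder data.
if __name__ == "__main__":
    src = ["sentence %d" % i for i in range(500)]
    ref = ["reference %d" % i for i in range(500)]
    print(direct_error_minimization([0.1, 0.5, -0.2, 1.0], src, ref))
```

Because the candidate's error is accumulated chunk by chunk, a clearly inferior model can be rejected after decoding only a fraction of the tuning set, which is one way to make decoder-in-the-loop optimization affordable.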