Sentence level machine translation evaluation as a ranking problem: one step aside from BLEU

  • Authors: Yang Ye; Ming Zhou; Chin-Yew Lin

  • Affiliations: University of Michigan; Microsoft Research Asia; Microsoft Research Asia

  • Venue: StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
  • Year: 2007

Abstract

The paper proposes formulating MT evaluation as a ranking problem, as is often done in human assessment practice. Under this ranking scenario, the study also investigates the relative utility of several features. The results show greater correlation with human assessment at the sentence level, even when an n-gram match score is used as a baseline feature. The feature contributing most to the rank-order correlation between automatic ranking and human assessment was the dependency structure relation, rather than the BLEU score or the reference language model feature.
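
To make the notion of rank-order correlation between an automatic ranking and human assessment concrete, the following is a minimal sketch, not the authors' implementation. It assumes Spearman's coefficient as the rank-order statistic (the abstract does not name one), and the function names and sample scores are hypothetical, chosen only for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks for values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one side is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: four candidate translations of one source sentence.
# Higher is better on both scales; the values are invented for illustration.
human_scores = [4, 3, 2, 1]
metric_scores = [0.9, 0.4, 0.6, 0.1]

rho = spearman(human_scores, metric_scores)
print(f"Spearman rho = {rho:.2f}")
```

In this toy case the automatic metric swaps the two middle translations relative to the human ordering, so the coefficient falls below 1. A sentence-level evaluation of the kind the abstract describes would compute such a coefficient per source sentence and then aggregate across sentences; how that aggregation is done here is not stated in the abstract.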