Further Meta-Evaluation of Machine Translation

  • Authors:
  • Chris Callison-Burch (Johns Hopkins University), Cameron Fordyce (University of Edinburgh), Philipp Koehn (University of Edinburgh), Christof Monz (University of London), Josh Schroeder (University of Edinburgh)

  • Venue:
  • StatMT '08: Proceedings of the Third Workshop on Statistical Machine Translation
  • Year:
  • 2008

Abstract

This paper analyzes the translation quality of machine translation systems for 10 language pairs translating between Czech, English, French, German, Hungarian, and Spanish. We report the translation quality of over 30 diverse translation systems based on a large-scale manual evaluation involving hundreds of hours of effort. We use the human judgments of the systems to analyze automatic evaluation metrics for translation quality, and we report the strength of the correlation with human judgments at both the system level and the sentence level. We validate our manual evaluation methodology by measuring intra- and inter-annotator agreement, and by collecting timing information.
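
To make the system-level analysis concrete, the short Python sketch below computes Spearman's rank correlation between per-system human judgments and automatic metric scores. The scores and the no-ties simplification are illustrative assumptions, not data or code from the paper.

# Minimal sketch of system-level meta-evaluation: Spearman's rank
# correlation between an automatic metric's per-system scores and
# human judgments. All scores below are illustrative, not the paper's.

def spearman_rho(xs, ys):
    # Spearman's rho for score lists with no ties:
    # rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))
    n = len(xs)
    rank = lambda vs: {v: r for r, v in enumerate(sorted(vs), start=1)}
    rx, ry = rank(xs), rank(ys)
    d2 = sum((rx[x] - ry[y]) ** 2 for x, y in zip(xs, ys))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical per-system scores for five MT systems.
human  = [0.61, 0.55, 0.48, 0.40, 0.33]  # share of wins in human rankings
metric = [31.2, 29.8, 30.1, 25.4, 24.0]  # automatic metric scores

print(f"system-level rho = {spearman_rho(human, metric):.3f}")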
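
Agreement figures of the kind the paper reports are conventionally summarized with a kappa coefficient, K = (P(A) - P(E)) / (1 - P(E)), where P(A) is the observed agreement and P(E) is the agreement expected by chance. The sketch below implements one common variant (Cohen's kappa, with chance agreement taken from each annotator's label marginals); the judgment labels and data are made up for illustration.

from collections import Counter

def cohen_kappa(ann_a, ann_b):
    # Observed agreement P(A).
    n = len(ann_a)
    p_agree = sum(a == b for a, b in zip(ann_a, ann_b)) / n
    # Chance agreement P(E) from each annotator's label distribution.
    ca, cb = Counter(ann_a), Counter(ann_b)
    p_chance = sum((ca[l] / n) * (cb[l] / n) for l in set(ca) | set(cb))
    return (p_agree - p_chance) / (1 - p_chance)

# Hypothetical relative judgments of the same outputs by two annotators.
a = ["better", "better", "worse", "equal", "better", "worse"]
b = ["better", "equal",  "worse", "equal", "better", "better"]
print(f"inter-annotator kappa = {cohen_kappa(a, b):.3f}")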