Corpus-based comprehensive and diagnostic MT evaluation: initial Arabic, Chinese, French, and Spanish results

  • Authors:
  • Kishore Papineni;Salim Roukos;Todd Ward;John Henderson;Florence Reeder

  • Affiliations:
  • IBM T. J. Watson Research Center;IBM T. J. Watson Research Center;IBM T. J. Watson Research Center;MITRE;MITRE

  • Venue:
  • HLT '02 Proceedings of the second international conference on Human Language Technology Research
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

We describe two metrics for automatic evaluation of machine translation quality. These metrics, BLEU and NEE, are compared to human judgment of quality of translation of Arabic, Chinese, French, and Spanish documents into English.