Automatic evaluation of translation quality for distant language pairs

  • Authors:
  • Hideki Isozaki;Tsutomu Hirao;Kevin Duh;Katsuhito Sudoh;Hajime Tsukada

  • Affiliations:
  • NTT Corporation, Sorakugun, Kyoto, Japan;NTT Corporation, Sorakugun, Kyoto, Japan;NTT Corporation, Sorakugun, Kyoto, Japan;NTT Corporation, Sorakugun, Kyoto, Japan;NTT Corporation, Sorakugun, Kyoto, Japan

  • Venue:
  • EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2010

Abstract

Automatic evaluation of Machine Translation (MT) quality is essential to developing high-quality MT systems. Various evaluation metrics have been proposed, and BLEU is now used as the de facto standard metric. However, when we consider translation between distant language pairs such as Japanese and English, most popular metrics (e.g., BLEU, NIST, PER, and TER) do not work well. It is well known that Japanese and English have completely different word orders, so special care must be taken with word order in translation. Translations with the wrong word order often lead to misunderstanding and incomprehensibility. For instance, SMT-based Japanese-to-English translators tend to translate 'A because B' as 'B because A.' Thus, word order is the most important problem for distant language translation. However, conventional evaluation metrics do not significantly penalize such word order mistakes. Therefore, locally optimizing these metrics leads to inadequate translations. In this paper, we propose an automatic evaluation metric based on rank correlation coefficients modified with precision. Our meta-evaluation of the NTCIR-7 PATMT JE task data shows that this metric outperforms conventional metrics.
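
The abstract does not give the formula, so the following is only a minimal sketch of what a rank-correlation score "modified with precision" could look like, not the paper's actual definition. It assumes whitespace tokenization, Kendall's tau as the rank correlation coefficient, normalization of tau to the range [0, 1], and a precision computed over words that occur exactly once in both sentences. The function name `order_score` and the example sentences are hypothetical.

```python
from itertools import combinations


def order_score(reference: str, hypothesis: str) -> float:
    """Word-order score in [0, 1]: normalized Kendall's tau times precision.

    Illustrative sketch under the assumptions stated above; it is not
    the exact metric proposed in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not hyp:
        return 0.0

    # Keep only words that occur exactly once in both sentences, so that
    # each kept word has an unambiguous position in the reference.
    common = [w for w in hyp if hyp.count(w) == 1 and ref.count(w) == 1]
    if len(common) < 2:
        return 0.0  # no word pair to compare

    # Reference positions, listed in hypothesis order.
    ranks = [ref.index(w) for w in common]

    # With no ties, (tau + 1) / 2 equals the fraction of concordant pairs.
    pairs = list(combinations(ranks, 2))
    concordant = sum(1 for a, b in pairs if a < b)
    normalized_tau = concordant / len(pairs)

    # Precision: share of hypothesis words that could be aligned at all.
    precision = len(common) / len(hyp)
    return normalized_tau * precision


reference = "he stayed home because it rained"
identical = order_score(reference, "he stayed home because it rained")
swapped = order_score(reference, "it rained because he stayed home")
print(identical, swapped)
```

In this toy pair, the swapped hypothesis reuses every reference word, so a purely word-matching score such as unigram precision cannot tell the two hypotheses apart, while the order-sensitive score drops sharply for the 'B because A' version.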