Tackling sparse data issue in machine translation evaluation

  • Authors: Ondřej Bojar; Kamil Kos; David Mareček
  • Affiliations: Charles University in Prague (all three authors)
  • Venue: ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
  • Year: 2010

Abstract

We illustrate and explain the problems of n-gram-based machine translation (MT) metrics (e.g. BLEU) when applied to morphologically rich languages such as Czech. A novel metric, SemPOS, based on the deep-syntactic representation of the sentence, tackles the issue and retains its performance for translation into English as well.
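
The sparse-data effect mentioned in the abstract can be illustrated with a minimal sketch, under stated assumptions. The code below computes a clipped n-gram precision in the spirit of BLEU (not the full BLEU score, and not SemPOS). The Czech sentence pair, the hand-written lemmas, and the function names `ngrams` and `clipped_precision` are hypothetical illustrations, not taken from the paper. The point is only that a hypothesis with the right words in the wrong inflected forms gets almost no credit from surface n-gram matching, while a comparison over lemmas does match.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, reference, n):
    """Fraction of hypothesis n-grams also found in the reference.

    Each hypothesis n-gram is credited at most as many times as it
    occurs in the reference (clipping), as in BLEU-style precision.
    """
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in hyp_counts.items())
    return matched / total


# Hypothetical toy data: "I see a big dog".
# The hypothesis uses nominative forms where the accusative is required.
reference_forms = "vidím velkého psa".split()
hypothesis_forms = "vidím velký pes".split()

# Hand-written lemmas for the same two sentences (an assumption; a real
# system would obtain them from a morphological analyzer).
reference_lemmas = "vidět velký pes".split()
hypothesis_lemmas = "vidět velký pes".split()

for n in (1, 2):
    print(n,
          clipped_precision(hypothesis_forms, reference_forms, n),
          clipped_precision(hypothesis_lemmas, reference_lemmas, n))
```

On surface forms only one of the three hypothesis unigrams and none of the bigrams match, whereas on lemmas every n-gram matches. Comparing lemmas here is just a stand-in for abstracting away from inflection; SemPOS itself works on a deep-syntactic representation, which this sketch does not attempt to reproduce.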