Normalized compression distance based measures for MetricsMATR 2010

  • Authors:
  • Marcus Dobrinkat; Jaakko Väyrynen; Tero Tapiovaara; Kimmo Kettunen

  • Affiliations:
  • Aalto University School of Science and Technology, Aalto, Finland; Aalto University School of Science and Technology, Aalto, Finland; Aalto University School of Science and Technology, Aalto, Finland; Kymenlaakso University of Applied Sciences, Kotka, Finland

  • Venue:
  • WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
  • Year:
  • 2010

Abstract

We present the MT-NCD and MT-mNCD machine translation evaluation metrics as our submission to the machine translation evaluation shared task (MetricsMATR 2010). The metrics are based on normalized compression distance (NCD), a general information-theoretic measure of string similarity, and are evaluated against human judgments from the WMT08 shared task. The experiments show that 1) flexible matching improves our metric's correlation with human judgments, 2) segment replication is effective, and 3) our NCD-inspired method for multiple references appears to improve results. Overall, the proposed MT-NCD and MT-mNCD methods correlate competitively with human judgments compared with commonly used machine translation evaluation metrics such as BLEU.
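
For readers unfamiliar with NCD, the standard definition is NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed length of s. The following is a minimal sketch of that general formula only, not the authors' implementation: the abstract does not name a compressor, so the use of zlib here is an assumption, and the function names `compressed_size` and `ncd` are hypothetical. It does not cover the flexible matching, segment replication, or multiple-reference handling of MT-NCD and MT-mNCD.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two strings.

    Values near 0 suggest the strings share most of their information;
    values near 1 suggest they share little.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)  # compressed size of the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    reference = "the quick brown fox jumps over the lazy dog near the river bank"
    candidate = "colorless green ideas sleep furiously while machines translate documents"
    # A translation identical to its reference should score lower (closer)
    # than an unrelated sentence.
    print(ncd(reference, reference))
    print(ncd(reference, candidate))
```

In an evaluation setting, a lower distance between a candidate translation and its reference would be read as a better translation. Note that on very short strings the fixed overhead of a general-purpose compressor can distort the score.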