Evaluating machine translation with LFG dependencies
Machine Translation
We present a novel method for evaluating the output of Machine Translation (MT), based on comparing the dependency structures of the translation and reference rather than their surface string forms. Our method uses a treebank-based, wide-coverage, probabilistic Lexical-Functional Grammar (LFG) parser to produce a set of structural dependencies for each translation-reference sentence pair, and then calculates the precision and recall for these dependencies. In contrast to most popular string-based evaluation metrics, our dependency-based evaluation does not unfairly penalize perfectly valid syntactic variation in the translation. In addition to allowing for legitimate syntactic differences, we use paraphrases in the evaluation process to account for lexical variation. In a comparison with other metrics on 16,800 sentences of Chinese-English newswire text, our method achieves high correlation with human scores. An experiment with two translations of 4,000 sentences from Spanish-English Europarl shows that, in contrast to most other metrics, our method does not display a strong bias towards statistical models of translation.
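To make the scoring step concrete, the sketch below computes precision, recall, and F-score over the dependency sets of a translation-reference pair, as the abstract describes. It is a minimal illustration, not the authors' implementation: the triple format, the dependency_f_score helper, and the toy example are assumptions, and any dependency parser could stand in for the paper's LFG parser when producing the input triples.

```python
# Minimal sketch of dependency-overlap scoring: precision and recall
# over the dependency sets of a translation and its reference.
# The (relation, head, dependent) triple format is an illustrative
# assumption; the paper derives its dependencies from a wide-coverage
# probabilistic LFG parser.
from collections import Counter

def dependency_f_score(translation_deps, reference_deps):
    """Score a translation against a reference by comparing their
    dependency triples, e.g. ("subject", "resign", "john").

    Both arguments are lists of (relation, head, dependent) triples,
    assumed to come from an external parser.
    """
    trans = Counter(translation_deps)
    ref = Counter(reference_deps)
    # Matched dependencies, respecting multiplicity (Counter & takes
    # the minimum count of each triple).
    matched = sum((trans & ref).values())
    precision = matched / sum(trans.values()) if trans else 0.0
    recall = matched / sum(ref.values()) if ref else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: an active/passive pair shares its labelled dependencies,
# so it scores well here even though the surface strings diverge.
translation = [("subject", "resign", "john"), ("tense", "resign", "past")]
reference = [("subject", "resign", "john"), ("tense", "resign", "past"),
             ("adjunct", "resign", "yesterday")]
print(dependency_f_score(translation, reference))  # 0.8
```

Because matching happens at the level of labelled dependencies rather than word order, a legitimate syntactic variant of the reference keeps most of its triples and is not penalized the way it would be under an n-gram metric; the paraphrase component described in the abstract would additionally relax the exact-match condition on lexical items.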