BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics
BLANC: learning evaluation metrics for MT
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Some Improvements over the BLEU Metric for Measuring Translation Quality for Hindi
ICCTA '07 Proceedings of the International Conference on Computing: Theory and Applications
Dependency-based automatic evaluation for machine translation
SSST '07 Proceedings of the NAACL-HLT 2007/AMTA Workshop on Syntax and Structure in Statistical Translation
In this paper, we describe automated measures used to evaluate machine translation quality in the Defense Advanced Research Projects Agency's Spoken Language Communication and Translation System for Tactical Use program, which is developing speech translation systems for dialogue between English and Iraqi Arabic speakers in military contexts. Limitations of the automated measures are illustrated, along with variants of the measures that seek to overcome those limitations. Both the dialogue structure of the data and the Iraqi Arabic language pose challenges for these measures, and the paper presents some solutions adopted by MITRE and NIST to improve confidence in the scores.
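The automated measures discussed above build on BLEU, the metric named in the first citation. As a minimal sketch (not the authors' implementation), the core of sentence-level BLEU against a single reference is modified n-gram precision with count clipping, combined via a geometric mean and a brevity penalty:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: one reference, no smoothing.
    Illustrative only; real evaluations use corpus-level BLEU with
    multiple references and smoothing for short segments."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each candidate n-gram count by its count in the reference,
        # so repeating a correct word cannot inflate the score.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(clipped / total)
    if min(precisions) == 0:
        return 0.0
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

For short dialogue utterances like those in the TRANSTAC data, many sentences have no matching higher-order n-grams, driving this unsmoothed score to zero; that brittleness is one motivation for the metric variants the cited papers explore.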