BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Improved statistical alignment models
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Confidence estimation for machine translation
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Automatic evaluation of machine translation quality using n-gram co-occurrence statistics
HLT '02 Proceedings of the second international conference on Human Language Technology Research
Moses: open source toolkit for statistical machine translation
ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
(Meta-) evaluation of machine translation
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Meteor: an automatic metric for MT evaluation with high levels of correlation with human judgments
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Linguistic features for automatic evaluation of heterogenous MT systems
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Textual entailment features for machine translation evaluation
StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
Machine translation evaluation versus quality estimation
Machine Translation
Towards cross-lingual textual entailment
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Error detection for statistical machine translation using linguistic features
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
LIBSVM: A library for support vector machines
ACM Transactions on Intelligent Systems and Technology (TIST)
Goodness: a method for measuring machine translation confidence
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Using bilingual parallel corpora for cross-lingual textual entailment
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
Detecting semantic equivalence and information disparity in cross-lingual documents
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Semeval-2012 task 8: cross-lingual textual entailment for content synchronization
SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
FBK: cross-lingual textual entailment without translation
SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
Detecting semantic equivalence and information disparity in cross-lingual documents
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Hi-index | 0.00 |
We address two challenges for automatic machine translation evaluation: a) avoiding the use of reference translations, and b) focusing on adequacy estimation. From an economic perspective, getting rid of costly hand-crafted reference translations (a) permits to alleviate the main bottleneck in MT evaluation. From a system evaluation perspective, pushing semantics into MT (b) is a necessity in order to complement the shallow methods currently used overcoming their limitations. Casting the problem as a cross-lingual textual entailment application, we experiment with different benchmarks and evaluation settings. Our method shows high correlation with human judgements and good results on all datasets without relying on reference translations.