Automatic machine translation (MT) evaluation metrics have traditionally been assessed by how well the scores they assign to MT output correlate with human judgments of translation performance. Different types of human judgment, such as Fluency, Adequacy, and HTER, measure different aspects of MT performance that automatic MT metrics can capture. We explore these differences using a new tunable MT metric, TER-Plus, which extends the Translation Edit Rate (TER) evaluation metric with tunable parameters and incorporates morphology, synonymy, and paraphrases. TER-Plus was shown to be one of the top metrics in NIST's Metrics MATR 2008 Challenge, having the highest average rank in terms of Pearson and Spearman correlation. Optimizing TER-Plus for different types of human judgment yields significantly improved correlations and meaningful changes in the weights of the different edit types, which demonstrates significant differences between the types of human judgment.
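The meta-evaluation described above, correlating metric scores with human judgments, can be sketched in a few lines of Python. This is only an illustrative sketch and not the authors' evaluation code: the function names (`pearson`, `average_ranks`, `spearman`) and the sample scores are hypothetical, and Spearman correlation is computed here as Pearson correlation over average ranks.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Hypothetical data: an edit-rate-style metric (lower is better) and
    # human adequacy judgments (higher is better) for five segments.
    metric_scores = [0.62, 0.35, 0.48, 0.20, 0.71]
    human_adequacy = [2.0, 4.0, 3.0, 5.0, 1.5]
    print("Pearson: ", pearson(metric_scores, human_adequacy))
    print("Spearman:", spearman(metric_scores, human_adequacy))
```

Because an edit-rate metric assigns lower scores to better translations, a useful metric of this kind correlates negatively with judgments such as Adequacy, so correlation magnitudes are usually compared after accounting for the sign.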