SPEDE: probabilistic edit distance metrics for MT evaluation

Authors:
Mengqiu Wang; Christopher D. Manning
Affiliations:
Stanford University, Stanford, CA; Stanford University, Stanford, CA
Venue:
WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation
Year:
2012

Abstract

This paper describes Stanford University's submission to the Shared Evaluation Task of WMT 2012. Our proposed metric (SPEDE) computes probabilistic edit distance as predictions of translation quality. We learn weighted edit distance in a probabilistic finite state machine (pFSM) model, where state transitions correspond to edit operations. While standard edit distance models cannot capture long-distance word swapping or cross alignments, we rectify these shortcomings using a novel pushdown automaton extension of the pFSM model. Our models are trained in a regression framework, and can easily incorporate a rich set of linguistic features. Evaluated on two different prediction tasks across a diverse set of datasets, our methods achieve state-of-the-art correlation with human judgments.
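To make the idea of weighted edit distance over word sequences concrete, here is a minimal Python sketch. It is not the authors' pFSM model: it uses fixed, hand-set costs and a deterministic minimum-cost dynamic program, whereas the abstract describes weights learned in a probabilistic regression framework. The function name `weighted_edit_distance` and the parameters `ins_cost`, `del_cost`, and `sub_cost` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def weighted_edit_distance(
    hyp: Sequence[str],
    ref: Sequence[str],
    ins_cost: float = 1.0,  # hypothetical cost of inserting a reference word
    del_cost: float = 1.0,  # hypothetical cost of deleting a hypothesis word
    sub_cost: float = 1.0,  # hypothetical cost of substituting one word for another
) -> float:
    """Minimum total cost of edits that turn `hyp` into `ref`.

    Illustrative only: costs are fixed by hand here, not learned.
    """
    n, m = len(hyp), len(ref)
    # dp[i][j] = cheapest way to turn hyp[:i] into ref[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + del_cost
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + ins_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # A match is free; a mismatch pays the substitution cost.
            diag = 0.0 if hyp[i - 1] == ref[j - 1] else sub_cost
            dp[i][j] = min(
                dp[i - 1][j] + del_cost,     # delete hyp[i-1]
                dp[i][j - 1] + ins_cost,     # insert ref[j-1]
                dp[i - 1][j - 1] + diag,     # match or substitute
            )
    return dp[n][m]


if __name__ == "__main__":
    hypothesis = "the cat sat on mat".split()
    reference = "the cat sat on the mat".split()
    print(weighted_edit_distance(hypothesis, reference))
```

Under this sketch, two swapped words are charged as two independent edits, since the model has no single operation for a reordering. That is the kind of limitation of standard edit distance that the abstract says the pushdown automaton extension is meant to address.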