Batch tuning strategies for statistical machine translation

Authors:
Colin Cherry;George Foster
Affiliations:
National Research Council Canada;National Research Council Canada
Venue:
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Year:
2012

Citing 22
Cited 4

Support vector machine learning for interdependent and structured output spaces

ICML '04 Proceedings of the twenty-first international conference on Machine learning
Discriminative training and maximum entropy models for statistical machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
An end-to-end discriminative approach to machine translation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
ORANGE: a method for evaluating automatic evaluation metrics for machine translation

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Online Passive-Aggressive Algorithms

The Journal of Machine Learning Research
Minimum risk annealing for training log-linear models

COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
A dual coordinate descent method for large-scale linear SVM

Proceedings of the 25th international conference on Machine learning
Moses: open source toolkit for statistical machine translation

ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Online large-margin training of syntactic and structural translation features

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Lattice-based minimum error rate training for statistical machine translation

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
11,001 new features for statistical machine translation

NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
First- and second-order expectation semirings with applications to minimum-risk training on translation forests

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Consensus training for consensus decoding in machine translation

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 - Volume 3
Distributed training strategies for the structured perceptron

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Training phrase translation models with leaving-one-out

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
A unified approach to minimum risk training and decoding

WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Better hypothesis testing for statistical machine translation: controlling for optimizer instability

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Expected BLEU training for graphs: BBN system description for WMT11 system combination task

WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
Tuning as ranking

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Structured ramp loss minimization for machine translation

NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Structured ramp loss minimization for machine translation

NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
On hierarchical re-ordering and permutation parsing for phrase-based decoding

WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation
Syntax-aware phrase-based statistical machine translation: system description

WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation
Optimization strategies for online large-margin learning in machine translation

WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation

Quantified Score

Hi-index	0.00

Visualization

Abstract

There has been a proliferation of recent work on SMT tuning algorithms capable of handling larger feature sets than the traditional MERT approach. We analyze a number of these algorithms in terms of their sentence-level loss functions, which motivates several new approaches, including a Structured SVM. We perform empirical comparisons of eight different tuning strategies, including MERT, in a variety of settings. Among other results, we find that a simple and efficient batch version of MIRA performs at least as well as training online, and consistently outperforms other options.