Training phrase translation models with leaving-one-out

  • Authors:
  • Joern Wuebker, Arne Mauser, Hermann Ney

  • Affiliations:
  • RWTH Aachen University, Germany (all authors)

  • Venue:
  • ACL '10: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

  • Year:
  • 2010

Abstract

Several attempts have been made to learn phrase translation probabilities for phrase-based statistical machine translation that go beyond pure counting of phrases in word-aligned training data. Most approaches report problems with over-fitting. We describe a novel leaving-one-out approach to prevent over-fitting that allows us to train phrase models that show improved translation performance on the WMT08 Europarl German-English task. In contrast to most previous work, where phrase models were trained separately from other models used in translation, we include all components, such as single-word lexica and reordering models, in training. Using this consistent training of phrase models, we achieve improvements of up to 1.4 points in BLEU. As a side effect, the phrase table size is reduced by more than 80%.
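
To make the leaving-one-out idea concrete, here is a minimal sketch in Python. It assumes a toy representation in which each sentence has already been reduced to its extracted phrase pairs; the corpus, the probability floor, and all names are illustrative, and the paper's actual training procedure (forced alignment, its specific singleton and phrase-length handling) is not reproduced. The sketch shows only the count adjustment itself: when scoring phrase pairs for sentence n, that sentence's own counts are subtracted, so a phrase pair seen only in n cannot justify its own extraction.

```python
from collections import Counter

# Toy corpus (illustrative): each sentence is the list of phrase pairs
# extracted from its word alignment; the extraction step is not shown.
corpus = [
    [("das haus", "the house"), ("ist", "is")],
    [("das haus", "the house"), ("ist klein", "is small")],
    [("haus", "house"), ("ist", "is")],
]

# Full-corpus phrase-pair counts C(f,e) and source marginals C(f).
per_sentence = [Counter(pairs) for pairs in corpus]
pair_counts = Counter()
src_counts = Counter()
for local in per_sentence:
    for (src, tgt), c in local.items():
        pair_counts[(src, tgt)] += c
        src_counts[src] += c

FLOOR = 1e-4  # crude stand-in for the paper's singleton treatment


def p_leave_one_out(src, tgt, n):
    """p(tgt | src) with sentence n's own counts removed.

    Without the subtraction, a phrase pair occurring only in sentence n
    would receive probability ~1 from its single occurrence, which is
    the over-fitting behavior the abstract describes.
    """
    local = per_sentence[n]
    num = pair_counts[(src, tgt)] - local[(src, tgt)]
    den = src_counts[src] - sum(c for (s, _), c in local.items() if s == src)
    if num <= 0 or den <= 0:
        return FLOOR
    return num / den


# Scoring while training on sentence 0 vs. sentence 1:
print(p_leave_one_out("das haus", "the house", 0))  # supported by sentence 1
print(p_leave_one_out("ist klein", "is small", 1))  # singleton -> floored
```

In this toy setting, "das haus" / "the house" still scores well for sentence 0 because sentence 1 independently supports it, while the singleton "ist klein" / "is small" falls to the floor when its only supporting sentence is held out; this is the mechanism by which unreliable phrase pairs are discounted and, in the paper's setup, the phrase table shrinks.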