Cardinality pruning and language model heuristics for hierarchical phrase-based translation

Authors:
David Vilar;Hermann Ney
Affiliations:
RWTH Aachen University, Aachen, Germany 52056 and Language Technology Lab, DFKI GmbH, Berlin, Germany 10559;RWTH Aachen University, Aachen, Germany 52056
Venue:
Machine Translation
Year:
2012

Abstract

In this article we present two novel enhancements for the cube pruning and cube growing algorithms, two of the most widely applied search methods in the hierarchical approach to statistical machine translation. Cube pruning is the de facto standard search algorithm for the hierarchical model. We adapt the source-cardinality-synchronous search organization used in standard phrase-based translation to the characteristics of cube pruning. This improves the performance of the generation process and reduces the average translation time per sentence to approximately one quarter. We also investigate the cube growing algorithm, a reformulation of cube pruning with on-demand computation. This algorithm depends on a heuristic for the language model, an issue that is barely discussed in the original work. We analyze the behaviour of this heuristic and propose a new one that greatly reduces memory consumption at no cost in runtime or translation performance. Results are reported on the German-English Europarl corpus.
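
For readers unfamiliar with cube pruning, the following is a minimal sketch of the basic idea only, not the enhancements proposed in the article. It assumes just two sorted lists of partial hypotheses and a hypothetical `combine_cost` function standing in for the language model score at the point where they are joined. The names `cube_prune`, `left`, `right` and `combine_cost` are illustrative and do not come from the article.

```python
import heapq


def cube_prune(left, right, combine_cost, k):
    """Return up to k low-cost combinations of one item from each list.

    left, right: lists of (cost, item), sorted by ascending cost.
    combine_cost: hypothetical function (item_a, item_b) -> extra cost,
        standing in for a language model score (an assumption of this sketch).
    """
    if not left or not right:
        return []

    def total(i, j):
        return left[i][0] + right[j][0] + combine_cost(left[i][1], right[j][1])

    # Start at the corner of the grid and expand neighbours lazily.
    frontier = [(total(0, 0), 0, 0)]
    seen = {(0, 0)}
    popped = []
    while frontier and len(popped) < k:
        cost, i, j = heapq.heappop(frontier)
        popped.append((cost, left[i][1], right[j][1]))
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(frontier, (total(ni, nj), ni, nj))
    # The combination cost can make the grid non-monotonic, so the pop
    # order is only approximately sorted; sort the results at the end.
    return sorted(popped)
```

When `combine_cost` is always zero the grid is monotonic and the result is the exact k-best list. With a real language model the search becomes approximate, which is why the heuristics used during the search matter.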