Improved models of distortion cost for statistical machine translation

Authors:
Spence Green;Michel Galley;Christopher D. Manning
Affiliations:
Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA
Venue:
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year:
2010

Abstract

The distortion cost function used in Moses-style machine translation systems has two flaws. First, it does not estimate the future cost of known required moves, thus increasing search errors. Second, all distortion is penalized linearly, even when appropriate re-orderings are performed. Because the cost function does not effectively constrain search, translation quality decreases at higher distortion limits, which are often needed when translating between languages of different typologies such as Arabic and English. To address these problems, we introduce a method for estimating future linear distortion cost, and a new discriminative distortion model that predicts word movement during translation. In combination, these extensions give a statistically significant improvement over a baseline distortion parameterization. When we triple the distortion limit, our model achieves a +2.32 BLEU average gain over Moses.
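
For readers unfamiliar with the baseline being criticized, here is a minimal sketch of the standard linear distortion penalty used in phrase-based decoding: the cost of each step is the absolute distance between the end of the previously translated source phrase and the start of the next one. It illustrates only the baseline cost, not the future cost estimate or the discriminative model the abstract introduces. The function names and the example spans are hypothetical.

```python
def linear_distortion(prev_end: int, next_start: int) -> int:
    """Linear distortion cost of moving from one source phrase to the next.

    prev_end:   source index of the last word of the previously translated
                phrase (-1 before anything has been translated).
    next_start: source index of the first word of the next phrase.
    """
    return abs(next_start - prev_end - 1)


def total_distortion(spans):
    """Sum the linear cost over (start, end) source spans in translation order."""
    cost = 0
    prev_end = -1
    for start, end in spans:
        cost += linear_distortion(prev_end, start)
        prev_end = end
    return cost


# Hypothetical six-word source sentence, segmented into three phrases.
monotone = [(0, 1), (2, 2), (3, 5)]    # translated left to right
reordered = [(3, 5), (0, 1), (2, 2)]   # last phrase translated first

print(total_distortion(monotone))   # 0
print(total_distortion(reordered))  # 9
```

In the reordered case the first step is charged only 3, although skipping source words 0 through 2 already commits the decoder to a later jump back that costs 6. That deferred charge is the kind of known required move the abstract says the baseline fails to anticipate, and the total of 9 is charged regardless of whether the reordering is linguistically appropriate.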