Distortion models for statistical machine translation

Authors:
Yaser Al-Onaizan;Kishore Papineni
Affiliations:
IBM T.J. Watson Research Center, Yorktown Heights, NY;IBM T.J. Watson Research Center, Yorktown Heights, NY
Venue:
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Year:
2006

Abstract

In this paper, we argue that n-gram language models are not sufficient to address the word reordering required for machine translation. We propose a new distortion model that can be used with existing phrase-based SMT decoders to address these limitations. We present empirical results for Arabic-to-English machine translation that show statistically significant improvements when the proposed model is used. We also propose a novel metric, based on word alignments, that measures word-order similarity (or difference) between any pair of languages.