Syntax based reordering with automatically derived rules for improved statistical machine translation

Authors:
Karthik Visweswariah;Jiri Navratil;Jeffrey Sorensen;Vijil Chenthamarakshan;Nanda Kambhatla
Affiliations:
IBM Research;IBM Research;Google, Inc.;IBM Research;IBM Research
Venue:
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Year:
2010

Abstract

Syntax-based reordering has been shown to be an effective way of handling word-order differences between source and target languages in Statistical Machine Translation (SMT) systems. We present a simple, automatic method for learning rules that reorder source sentences to match the target-language word order more closely, using only a source-side parse tree and automatically generated alignments. The resulting rules are applied to source-language inputs as a pre-processing step. They yield significant improvements in SMT systems across a variety of language pairs, including English to Hindi, English to Spanish, and English to French, as measured on several internal test sets and a public test set.
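
The abstract does not say what form the learned rules take, so the following is only a toy sketch of the general idea, not the authors' method. It assumes, purely for illustration, that a rule maps a parent label plus its sequence of child labels to a permutation of those children, and that the permutation is chosen by ordering children by the mean target position of their aligned words. The tree encoding, the one-to-one alignment dictionary, and the function names `learn_rules` and `reorder` are all hypothetical.

```python
from collections import Counter, defaultdict

# Assumed encoding: an internal node is (label, [children]);
# a leaf is (pos_tag, source_word_index).

def leaf_indices(tree):
    """Return the source word indices covered by a subtree, left to right."""
    _, body = tree
    if isinstance(body, int):
        return [body]
    return [i for child in body for i in leaf_indices(child)]

def child_permutation(children, alignment):
    """Order children by the mean target position of their aligned words.

    `alignment` maps a source index to one target index (a simplification).
    Returns None when some child has no aligned word.
    """
    keys = []
    for child in children:
        targets = [alignment[i] for i in leaf_indices(child) if i in alignment]
        if not targets:
            return None
        keys.append(sum(targets) / len(targets))
    # sorted() is stable, so ties keep the original source order.
    return tuple(sorted(range(len(children)), key=lambda k: keys[k]))

def learn_rules(examples):
    """Count permutations per (label, child labels); keep the most frequent."""
    counts = defaultdict(Counter)

    def visit(tree, alignment):
        label, body = tree
        if isinstance(body, int):
            return
        perm = child_permutation(body, alignment)
        if perm is not None:
            counts[(label, tuple(c[0] for c in body))][perm] += 1
        for child in body:
            visit(child, alignment)

    for tree, alignment in examples:
        visit(tree, alignment)
    return {lhs: c.most_common(1)[0][0] for lhs, c in counts.items()}

def reorder(tree, rules):
    """Apply the rules top-down and return the reordered source indices."""
    label, body = tree
    if isinstance(body, int):
        return [body]
    lhs = (label, tuple(c[0] for c in body))
    perm = rules.get(lhs, tuple(range(len(body))))  # unseen: keep order
    return [i for k in perm for i in reorder(body[k], rules)]

# Toy English sentence with an assumed verb-final target order.
words = ["he", "eats", "apples"]
tree = ("S", [("NP", [("PRP", 0)]),
              ("VP", [("VBZ", 1), ("NP", [("NNS", 2)])])])
alignment = {0: 0, 1: 2, 2: 1}  # source index -> target index

rules = learn_rules([(tree, alignment)])
print(" ".join(words[i] for i in reorder(tree, rules)))  # he apples eats
```

In this toy setup the only non-identity rule learned is the one that swaps the verb and its object under `VP`, which moves the source sentence toward a verb-final order before it would be passed to the translation system.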