Improving a statistical MT system with automatically learned rewrite patterns

Authors:
Fei Xia; Michael McCord
Affiliations:
IBM T. J. Watson Research Center, Yorktown Heights, NY; IBM T. J. Watson Research Center, Yorktown Heights, NY
Venue:
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Year:
2004

Abstract

Current clump-based statistical MT systems have two limitations with respect to word ordering. First, they lack a mechanism for expressing and using generalizations that account for reorderings of linguistic phrases. Second, the ordering of target words in such systems does not respect linguistic phrase boundaries. To address these limitations, we propose using automatically learned rewrite patterns to preprocess the source sentences so that their word order is similar to that of the target language. Our system is a hybrid: the basic model is statistical, but we use broad-coverage rule-based parsers in two ways, during training to learn the rewrite patterns and at runtime to reorder the source sentences. Our experiments show a 10% relative improvement in BLEU score.
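
The runtime reordering step described in the abstract can be illustrated with a minimal sketch. It assumes a toy parse tree and a hand-written pattern table; the `Node` class, the `PATTERNS` format, and the adjective-noun rule are hypothetical stand-ins chosen for illustration, not the pattern representation, parser output, or learned rules used in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """A parse-tree node: a leaf carries a word, an internal node carries children."""
    label: str
    word: str = ""
    children: List["Node"] = field(default_factory=list)


# Hypothetical pattern table (an assumption, not the paper's format):
# (parent label, child labels) -> new order of the children, as indices.
Patterns = Dict[Tuple[str, Tuple[str, ...]], Tuple[int, ...]]


def reorder(node: Node, patterns: Patterns) -> Node:
    """Return a copy of the tree with every matching pattern applied, bottom-up."""
    kids = [reorder(child, patterns) for child in node.children]
    key = (node.label, tuple(k.label for k in kids))
    order = patterns.get(key)
    if order is not None:
        kids = [kids[i] for i in order]
    return Node(node.label, node.word, kids)


def words(node: Node) -> List[str]:
    """Read the words off the leaves, left to right."""
    if not node.children:
        return [node.word]
    out: List[str] = []
    for child in node.children:
        out.extend(words(child))
    return out


# Toy rule: move an adjective after the noun it modifies.
PATTERNS: Patterns = {("NP", ("Det", "Adj", "N")): (0, 2, 1)}

TREE = Node("S", children=[
    Node("NP", children=[Node("Det", "the"), Node("Adj", "red"), Node("N", "car")]),
    Node("VP", children=[Node("V", "stops")]),
])

if __name__ == "__main__":
    # The reordered source sentence is what would be passed to the statistical model.
    print(" ".join(words(reorder(TREE, PATTERNS))))
```

In this sketch the pattern table is keyed on the exact sequence of child labels, so a rule fires only when a node's children match it in full; a learned pattern set would presumably need a richer matching scheme and a way to choose among competing patterns, which the sketch does not attempt.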