A decoder for syntax-based statistical MT

Authors:
Kenji Yamada; Kevin Knight
Affiliations:
University of Southern California, Marina del Rey, CA; University of Southern California, Marina del Rey, CA
Venue:
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Year:
2002

Citing 10
Cited 34

Head-driven statistical models for natural language parsing
Learning dependency translation models as collections of finite-state head transducers

Computational Linguistics - Special issue on finite-state methods in NLP
The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora

Computational Linguistics
Forest-based statistical sentence generation

NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
A syntax-based statistical translation model
Immediate-head parsing for language models

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
A syntax-based statistical translation model

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Improved statistical alignment models

ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics

Translation by the Numbers: Language Weaver

AMTA '02 Proceedings of the 5th Conference of the Association for Machine Translation in the Americas on Machine Translation: From Research to Real Users
A phrase-based unigram model for statistical machine translation

NAACL-Short '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: companion volume of the Proceedings of HLT-NAACL 2003--short papers - Volume 2
Loosely tree-based alignment for machine translation

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Using 'smart' bilingual projection to feature-tag a monolingual dictionary

CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
A projection extension algorithm for statistical machine translation

EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
Statistical machine translation using coercive two-level syntactic transduction

EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
Statistical machine translation by parsing

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Machine translation using probabilistic synchronous dependency insertion grammars

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Distortion models for statistical machine translation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Syntax-based alignment: supervised or unsupervised?

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Improving a statistical MT system with automatically learned rewrite patterns

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Translating with non-contiguous phrases

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
The Hiero machine translation system: extensions, evaluation, and analysis

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Interactively exploring a machine translation model

ACLdemo '05 Proceedings of the ACL 2005 on Interactive poster and demonstration sessions
Improving statistical MT by coupling reordering and decoding

Machine Translation
Statistical machine translation

ACM Computing Surveys (CSUR)
Training tree transducers

Computational Linguistics
11,001 new features for statistical machine translation

NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
English-to-Czech factored machine translation

StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Syntax augmented machine translation via chart parsing

StatMT '06 Proceedings of the Workshop on Statistical Machine Translation
Book review:

Computational Linguistics
Accuracy-based scoring for DOT: towards direct error minimization for data-oriented translation

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
A monolingual tree-based translation model for sentence simplification

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Compositions of top-down tree transducers with ε-rules

FSMNLP'09 Proceedings of the 8th international conference on Finite-state methods and natural language processing
Short communication: A SomAgent statistical machine translation

Applied Soft Computing
Re-structuring, re-labeling, and re-aligning for syntax-based machine translation

Computational Linguistics
Tree transformations and dependencies

MOL'11 Proceedings of the 12th biennial conference on The mathematics of language
Automatic learning of parallel dependency treelet pairs

IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
A word reordering model for improved machine translation

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Survey: weighted extended top-down tree transducers part iii - composition

Algebraic Foundations in Computer Science
Survey: Weighted Extended Top-down Tree Transducers Part II—Application in Machine Translation

Fundamenta Informaticae - Non-Classical Models of Automata and Applications II
Clustered word classes for preordering in statistical machine translation

ROBUS-UNSUP '12 Proceedings of the Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP
Post-Ordering by Parsing with ITG for Japanese-English Statistical Machine Translation

ACM Transactions on Asian Language Information Processing (TALIP)
Statistical machine translation enhancements through linguistic levels: A survey

ACM Computing Surveys (CSUR)

Abstract

This paper describes a decoding algorithm for a syntax-based translation model (Yamada and Knight, 2001). The model has been extended to incorporate phrasal translations as presented here. In contrast to a conventional word-to-word statistical model, a decoder for the syntax-based model builds up an English parse tree given a sentence in a foreign language. As the model size becomes huge in a practical setting, and the decoder considers multiple syntactic structures for each word alignment, several pruning techniques are necessary. We tested our decoder in a Chinese-to-English translation system, and obtained better results than IBM Model 4. We also discuss issues concerning the relation between this decoder and a language model.