A projection extension algorithm for statistical machine translation

Authors: Christoph Tillmann
Affiliations: IBM T.J. Watson Research Center, Yorktown Heights, NY
Venue: EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
Year: 2003

Abstract

In this paper, we describe a phrase-based unigram model for statistical machine translation that uses a much simpler set of model parameters than similar phrase-based models. The units of translation are blocks, that is, pairs of phrases. During decoding, we use a block unigram model and a word-based trigram language model. During training, the blocks are learned from source interval projections using an underlying high-precision word alignment. System performance is significantly increased by applying a novel block extension algorithm that uses an additional high-recall word alignment. The blocks are further filtered using unigram-count selection criteria. The system has been successfully tested on a Chinese-English and an Arabic-English translation task.
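
To make the training step concrete, the following is a minimal sketch of how blocks might be learned by projecting source intervals through a word alignment and then filtered by count. It is an illustration under stated assumptions, not the paper's implementation: the function names (`project_source_interval`, `extract_blocks`, `filter_by_count`), the consistency check, the `max_len` limit, and the `min_count` threshold are all hypothetical. The block extension step is not shown, since the abstract does not specify how it works.

```python
from collections import Counter


def project_source_interval(alignment, j_start, j_end):
    """Project the source interval [j_start, j_end] onto the target side.

    `alignment` is a set of (source_index, target_index) links.
    Returns (i_min, i_max), or None if the projection is empty or inconsistent.
    Assumption: a projection is accepted only if no target word inside the
    projected interval is linked to a source word outside [j_start, j_end].
    """
    targets = [i for (j, i) in alignment if j_start <= j <= j_end]
    if not targets:
        return None
    i_min, i_max = min(targets), max(targets)
    for (j, i) in alignment:
        if i_min <= i <= i_max and not (j_start <= j <= j_end):
            return None
    return i_min, i_max


def extract_blocks(source, target, alignment, max_len=3):
    """Collect (source phrase, target phrase) blocks for one sentence pair."""
    blocks = []
    for j_start in range(len(source)):
        # Hypothetical limit: source phrases of at most max_len words.
        for j_end in range(j_start, min(j_start + max_len, len(source))):
            span = project_source_interval(alignment, j_start, j_end)
            if span is None:
                continue
            i_min, i_max = span
            blocks.append((tuple(source[j_start:j_end + 1]),
                           tuple(target[i_min:i_max + 1])))
    return blocks


def filter_by_count(blocks, min_count=2):
    """Keep only blocks observed at least min_count times (assumed criterion)."""
    counts = Counter(blocks)
    return {block: n for block, n in counts.items() if n >= min_count}


# Toy sentence pair with a reordering between the second and third words.
source = ["a", "b", "c"]
target = ["x", "y", "z"]
alignment = {(0, 0), (1, 2), (2, 1)}
blocks = extract_blocks(source, target, alignment)
```

In this toy example, the source interval covering "a b" yields no block, because its projection spans a target word that is linked to "c", whereas "b c" projects cleanly onto "y z". In a full system the counts would be accumulated over all sentence pairs in the training corpus before filtering.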