Phrase translation probabilities with ITG priors and smoothing as learning objective

Authors:
Markos Mylonakis; Khalil Sima'an
Affiliations:
University of Amsterdam; University of Amsterdam
Venue:
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Year:
2008

Abstract

The conditional phrase translation probabilities constitute the principal components of phrase-based machine translation systems. These probabilities are estimated using a heuristic method that does not seem to optimize any reasonable objective function of the word-aligned, parallel training corpus. Earlier efforts to devise a better-understood estimator either do not scale to reasonably sized training data or lead to deteriorating performance. In this paper we explore a new approach based on three ingredients: (1) a generative model with a prior over latent segmentations derived from Inversion Transduction Grammar (ITG), (2) a phrase table containing all phrase pairs without length limit, and (3) smoothing as the learning objective, using a novel Maximum-A-Posteriori version of Deleted Estimation working with Expectation-Maximization. Where others conclude that latent segmentations lead to overfitting and deteriorating performance, we show here that these three ingredients give performance equivalent to the heuristic method on reasonably sized training data.
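
For orientation, the heuristic estimator mentioned in the abstract is commonly understood to be relative-frequency counting over the phrase pairs extracted from the word-aligned corpus. The abstract does not spell this out, so the sketch below rests on that assumption. The function name `relative_frequency_estimates` and the toy phrase pairs are hypothetical and do not come from the paper; the sketch illustrates only the baseline heuristic, not the ITG-prior model or the Deleted Estimation procedure.

```python
from collections import Counter


def relative_frequency_estimates(phrase_pairs):
    """Estimate p(target | source) by relative frequency.

    `phrase_pairs` is an iterable of (source_phrase, target_phrase) tuples,
    assumed to have been extracted already from a word-aligned corpus.
    """
    pair_counts = Counter(phrase_pairs)

    # Total number of extracted pairs for each source phrase.
    source_totals = Counter()
    for (source, _target), count in pair_counts.items():
        source_totals[source] += count

    # Conditional probability = joint count / marginal count of the source.
    return {
        (source, target): count / source_totals[source]
        for (source, target), count in pair_counts.items()
    }


# Hypothetical toy data, for illustration only.
extracted = [
    ("das haus", "the house"),
    ("das haus", "the house"),
    ("das haus", "the building"),
    ("haus", "house"),
]
probs = relative_frequency_estimates(extracted)
```

Because each extracted pair is counted independently of any segmentation of the sentence pair, this estimate is not tied to a likelihood over the training corpus, which is the gap the abstract points to.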