Learning probabilistic synchronous CFGs for phrase-based translation

Authors:
Markos Mylonakis;Khalil Sima'an
Affiliations:
University of Amsterdam;University of Amsterdam
Venue:
CoNLL '10 Proceedings of the Fourteenth Conference on Computational Natural Language Learning
Year:
2010

Citing 16
Cited 3

Bias/variance decompositions for likelihood-based estimators

Neural Computation
A systematic comparison of various statistical alignment models

Computational Linguistics
Data-Oriented Parsing

Data-Oriented Parsing
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora

Computational Linguistics
Statistical phrase-based translation

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Minimum error rate training in statistical machine translation

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
The Alignment Template Approach to Statistical Machine Translation

Computational Linguistics
A phrase-based, joint probability model for statistical machine translation

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
A hierarchical phrase-based model for statistical machine translation

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Hierarchical Phrase-Based Translation

Computational Linguistics
Sampling alignment structure under a Bayesian translation model

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Phrase translation probabilities with ITG priors and smoothing as learning objective

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Joshua: an open source toolkit for parsing-based machine translation

StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
Why generative phrase models underperform surface heuristics

StatMT '06 Proceedings of the Workshop on Statistical Machine Translation
Syntax augmented machine translation via chart parsing

StatMT '06 Proceedings of the Workshop on Statistical Machine Translation
A Gibbs sampler for phrasal synchronous grammar induction

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2

Learning hierarchical translation structure with linguistic annotations

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Lateen EM: unsupervised training with multiple objectives, applied to dependency grammar induction

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Leave-one-out phrase model training for large-scale deployment

WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation

Quantified Score

Hi-index	0.00

Visualization

Abstract

Probabilistic phrase-based synchronous grammars are now considered promising devices for statistical machine translation because they can express reordering phenomena between pairs of languages. Learning these hierarchical, probabilistic devices from parallel corpora constitutes a major challenge, because of multiple latent model variables as well as the risk of data overfitting. This paper presents an effective method for learning a family of particular interest to MT, binary Synchronous Context-Free Grammars with inverted/monotone orientation (a.k.a. Binary ITG). A second contribution concerns devising a lexicalized phrase reordering mechanism that has complimentary strengths to Chiang's model. The latter conditions reordering decisions on the surrounding lexical context of phrases, whereas our mechanism works with the lexical content of phrase pairs (akin to standard phrase-based systems). Surprisingly, our experiments on French-English data show that our learning method applied to far simpler models exhibits performance indistinguishable from the Hiero system.