Gappy phrasal alignment by agreement

Authors:
Mohit Bansal;Chris Quirk;Robert C. Moore
Affiliations:
UC Berkeley;Microsoft Research;Google Research
Venue:
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Year:
2011

Citing 14
Cited 1

A systematic comparison of various statistical alignment models

Computational Linguistics
The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora

Computational Linguistics
HMM-based word alignment in statistical translation

COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
Statistical phrase-based translation

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
A phrase-based, joint probability model for statistical machine translation

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
HMM word and phrase alignment for statistical machine translation

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Alignment by agreement

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Hierarchical Phrase-Based Translation

Computational Linguistics
Measuring Word Alignment Quality for Statistical Machine Translation

Computational Linguistics
Why generative phrase models underperform surface heuristics

StatMT '06 Proceedings of the Workshop on Statistical Machine Translation
Syntax augmented machine translation via chart parsing

StatMT '06 Proceedings of the Workshop on Statistical Machine Translation
Accurate non-hierarchical phrase-based translation

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Pseudo-word for phrase-based machine translation

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Generative models of monolingual and bilingual gappy patterns

WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation

Quantified Score

Hi-index	0.00

Visualization

Abstract

We propose a principled and efficient phrase-to-phrase alignment model, useful in machine translation as well as other related natural language processing problems. In a hidden semi-Markov model, word-to-phrase and phrase-to-word translations are modeled directly by the system. Agreement between two directional models encourages the selection of parsimonious phrasal alignments, avoiding the overfitting commonly encountered in unsupervised training with multi-word units. Expanding the state space to include "gappy phrases" (such as French ne * pas) makes the alignment space more symmetric; thus, it allows agreement between discontinuous alignments. The resulting system shows substantial improvements in both alignment quality and translation quality over word-based Hidden Markov Models, while maintaining asymptotically equivalent runtime.