Coarse-to-fine syntactic machine translation using language projections

Authors:
Slav Petrov;Aria Haghighi;Dan Klein
Affiliations:
University of California at Berkeley, Berkeley, CA;University of California at Berkeley, Berkeley, CA;University of California at Berkeley, Berkeley, CA
Venue:
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Year:
2008

Abstract

The intersection of tree transducer-based translation models with n-gram language models results in huge dynamic programs for machine translation decoding. We propose a multipass, coarse-to-fine approach in which the language model complexity is incrementally introduced. In contrast to previous order-based bigram-to-trigram approaches, we focus on encoding-based methods, which use a clustered encoding of the target language. Across various encoding schemes, and for multiple language pairs, we show speed-ups of up to 50 times over single-pass decoding while improving BLEU score. Moreover, our entire decoding cascade for trigram language models is faster than the corresponding bigram pass alone of a bigram-to-trigram decoder.
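
As a rough illustration of the encoding-based idea described above, the sketch below assigns each target word a bit-string cluster code and derives coarser language models by truncating those codes. It is a minimal sketch under stated assumptions, not the authors' implementation: the vocabulary, the codes, the toy bigram scores, the max-projection rule, and all function names are hypothetical, and pruning is applied to individual bigrams rather than to items in a decoding chart.

```python
from collections import defaultdict

# Hypothetical bit-string codes, as if read off a hierarchical clustering
# of target-language words. Codes and scores are illustrative assumptions.
WORD_CODES = {
    "the": "000",
    "a": "001",
    "cat": "100",
    "dog": "101",
    "runs": "110",
    "sleeps": "111",
}

# Toy fine-grained bigram log-probabilities (assumed values).
FINE_BIGRAM_LOGPROB = {
    ("the", "cat"): -1.0,
    ("the", "dog"): -1.5,
    ("a", "cat"): -2.0,
    ("a", "dog"): -1.2,
    ("cat", "runs"): -0.7,
    ("cat", "sleeps"): -0.9,
    ("dog", "runs"): -0.5,
    ("dog", "sleeps"): -1.1,
}


def project(word, bits):
    """Map a word to its cluster at a given granularity (a code prefix)."""
    return WORD_CODES[word][:bits]


def coarse_bigram_table(bits):
    """Score each coarse bigram by the max over the fine bigrams it covers.

    Taking the max makes the coarse score an upper bound on every fine
    score that projects onto it (an assumed projection rule).
    """
    table = defaultdict(lambda: float("-inf"))
    for (w1, w2), logp in FINE_BIGRAM_LOGPROB.items():
        key = (project(w1, bits), project(w2, bits))
        table[key] = max(table[key], logp)
    return dict(table)


def surviving_bigrams(bits, threshold):
    """Keep fine bigrams whose coarse projection scores at least `threshold`."""
    coarse = coarse_bigram_table(bits)
    return {
        (w1, w2)
        for (w1, w2) in FINE_BIGRAM_LOGPROB
        if coarse[(project(w1, bits), project(w2, bits))] >= threshold
    }
```

With one bit per word the six-word vocabulary collapses to two clusters, so the coarse pass scores only two bigram types; a later pass with more bits would then need to consider only the fine bigrams that survive the coarse threshold. Using all three bits recovers the original fine-grained table.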