Efficient parsing for transducer grammars

Authors:
John DeNero;Mohit Bansal;Adam Pauls;Dan Klein
Affiliations:
University of California, Berkeley (all authors)
Venue:
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year:
2009

Abstract

The tree-transducer grammars that arise in current syntactic machine translation systems are large, flat, and highly lexicalized. We address the problem of parsing efficiently with such grammars in three ways. First, we present a pair of grammar transformations that admit an efficient cubic-time CKY-style parsing algorithm despite leaving most of the grammar in n-ary form. Second, we show how the number of intermediate symbols generated by this transformation can be substantially reduced through binarization choices. Finally, we describe a two-pass coarse-to-fine parsing approach that prunes the search space using predictions from a subset of the original grammar. Together, these techniques reduce parsing time by 81%. We also describe a coarse-to-fine pruning scheme for forest-based language model reranking that allows a 100-fold increase in beam size while reducing decoding time. The resulting translations improve by 1.3 BLEU.
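
To make the second point concrete, the sketch below shows in a toy setting how a binarization choice can change the number of intermediate symbols. It is not the transformation described in the paper, which leaves most of the grammar in n-ary form. It is plain left-branching versus right-branching binarization, with each intermediate symbol named after the child sequence it covers so that identical sequences are shared across rules. The function `binarize`, the rule set `TOY_RULES`, and the symbol-naming scheme are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

# A rule is (parent, children). Hypothetical representation for this sketch.
Rule = Tuple[str, Tuple[str, ...]]


def binarize(rules: Iterable[Rule], direction: str = "left") -> Tuple[Set[Rule], Set[str]]:
    """Binarize n-ary rules and return (binary rules, intermediate symbols).

    Each intermediate symbol is named after the pair it covers, so two rules
    that share a child prefix (left) or suffix (right) reuse the same symbol.
    """
    if direction not in ("left", "right"):
        raise ValueError("direction must be 'left' or 'right'")
    binary: Set[Rule] = set()
    intermediates: Set[str] = set()
    for parent, children in rules:
        kids = list(children)
        while len(kids) > 2:
            if direction == "left":
                # Merge the two leftmost children into one new symbol.
                pair, rest = kids[:2], kids[2:]
                new = "<" + " ".join(pair) + ">"
                kids = [new] + rest
            else:
                # Merge the two rightmost children into one new symbol.
                pair, rest = kids[-2:], kids[:-2]
                new = "<" + " ".join(pair) + ">"
                kids = rest + [new]
            binary.add((new, tuple(pair)))
            intermediates.add(new)
        binary.add((parent, tuple(kids)))
    return binary, intermediates


# Toy flat rules that share prefixes but not suffixes (invented data).
TOY_RULES = [
    ("NP", ("DT", "JJ", "NN", "PP")),
    ("NP", ("DT", "JJ", "NN", "SBAR")),
    ("NP", ("DT", "JJ", "NNS", "PP")),
]

left_rules, left_syms = binarize(TOY_RULES, "left")
right_rules, right_syms = binarize(TOY_RULES, "right")
```

On this toy grammar, left-branching binarization introduces fewer intermediate symbols than right-branching because the three rules share prefixes but not suffixes. A grammar whose rules share suffixes would favor the opposite choice, which is why the choice is worth making with the grammar in view rather than fixing one direction globally.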