Stochastic inversion transduction grammars and bilingual parsing of parallel corpora
Computational Linguistics
BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
The Alignment Template Approach to Statistical Machine Translation
Computational Linguistics
Scaling phrase-based statistical machine translation to larger corpora and longer phrases
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Left-to-right target generation for hierarchical phrase-based translation
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Empirical lower bounds on the complexity of translational equivalence
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Translating with non-contiguous phrases
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
N-gram-based Machine Translation
Computational Linguistics
Hierarchical Phrase-Based Translation
Computational Linguistics
Moses: open source toolkit for statistical machine translation
ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Tera-scale translation models via pattern matching
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
A unigram orientation model for statistical machine translation
HLT-NAACL-Short '04 Proceedings of HLT-NAACL 2004: Short Papers
11,001 new features for statistical machine translation
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Empirical lower bounds on alignment error rates in syntax-based machine translation
SSST '09 Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation
Joshua: an open source toolkit for parsing-based machine translation
StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
Machine translation as lexicalized parsing with hooks
Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
IWPT '09 Proceedings of the 11th International Conference on Parsing Technologies
A joint sequence translation model with integrated reordering
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Gappy phrasal alignment by agreement
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Generative models of monolingual and bilingual gappy patterns
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
Quasi-synchronous phrase dependency grammars for machine translation
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
On hierarchical re-ordering and permutation parsing for phrase-based decoding
WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation
Hi-index | 0.00 |
A principal weakness of conventional (i.e., non-hierarchical) phrase-based statistical machine translation is that it can only exploit continuous phrases. In this paper, we extend phrase-based decoding to allow both source and target phrasal discontinuities, which provide better generalization on unseen data and yield significant improvements to a standard phrase-based system (Moses). More interestingly, our discontinuous phrase-based system also outperforms a state-of-the-art hierarchical system (Joshua) by a very significant margin (+1.03 BLEU on average on five Chinese-English NIST test sets), even though both Joshua and our system support discontinuous phrases. Since the key difference between these two systems is that ours is not hierarchical---i.e., our system uses a string-based decoder instead of CKY, and it imposes no hard hierarchical reordering constraints during training and decoding---this paper sets out to challenge the commonly held belief that the tree-based parameterization of systems such as Hiero and Joshua is crucial to their good performance against Moses.