Learning translation boundaries for phrase-based decoding

Authors:
Deyi Xiong;Min Zhang;Haizhou Li
Affiliations:
Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore
Venue:
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year:
2010

Citing 17
Cited 6

A maximum entropy approach to natural language processing

Computational Linguistics
Maximum Entropy Markov Models for Information Extraction and Segmentation

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora

Computational Linguistics
Decoding complexity in word-replacement translation models

Computational Linguistics
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Minimum error rate training in statistical machine translation

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Improved statistical alignment models

ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Dividing and conquering long sentences in a translation system

HLT '91 Proceedings of the workshop on Speech and Natural Language
A hierarchical phrase-based model for statistical machine translation

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Clause restructuring for statistical machine translation

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Maximum entropy based phrase reordering model for statistical machine translation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Reordering constraints for phrase-based statistical machine translation

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Classifying chart cells for quadratic complexity context-free inference

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Extracting synchronous grammar rules from word-level alignments in linear time

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
A syntax-driven bracketing model for phrase-based translation

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Parsing the penn chinese treebank with semantic knowledge

IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing

Soft syntactic constraints for hierarchical phrase-based translation using latent syntactic distributions

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Learning phrase boundaries for hierarchical phrase-based translation

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Learning to transform and select elementary trees for improved syntax-based machine translations

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
A statistical tree annotator and its applications

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Reordering constraint based on document-level context

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Integrating source-language context into phrase-based statistical machine translation

Machine Translation

Quantified Score

Hi-index	0.00

Visualization

Abstract

Constrained decoding is of great importance not only for speed but also for translation quality. Previous efforts explore soft syntactic constraints which are based on constituent boundaries deduced from parse trees of the source language. We present a new framework to establish soft constraints based on a more natural alternative: translation boundary rather than constituent boundary. We propose simple classifiers to learn translation boundaries for any source sentences. The classifiers are trained directly on word-aligned corpus without using any additional resources. We report the accuracy of our translation boundary classifiers. We show that using constraints based on translation boundaries predicted by our classifiers achieves significant improvements over the baseline on large-scale Chinese-to-English translation experiments. The new constraints also significantly outperform constituent boundary based syntactic constrains.