Effective use of linguistic and contextual information for statistical machine translation

Authors:
Libin Shen;Jinxi Xu;Bing Zhang;Spyros Matsoukas;Ralph Weischedel
Affiliations:
BBN Technologies, Cambridge, MA;BBN Technologies, Cambridge, MA;BBN Technologies, Cambridge, MA;BBN Technologies, Cambridge, MA;BBN Technologies, Cambridge, MA
Venue:
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1
Year:
2009

Abstract

Current methods of using lexical features in machine translation have difficulty scaling up to realistic MT tasks because of the prohibitively large number of parameters involved. In this paper, we propose methods of using new linguistic and contextual features that do not suffer from this problem, and we apply them in a state-of-the-art hierarchical MT system. The features used in this work are non-terminal labels, non-terminal length distribution, source string context, and source dependency LM scores. The effectiveness of our techniques is demonstrated by significant improvements over a strong baseline. On Arabic-to-English translation, the improvements in lower-cased BLEU on decoding output are 2.0 on NIST MT06 and 1.7 on MT08 newswire data. On Chinese-to-English translation, the improvements are 1.0 on MT06 and 0.8 on MT08 newswire data.