On-line language model biasing for statistical machine translation

Authors:
Sankaranarayanan Ananthakrishnan;Rohit Prasad;Prem Natarajan
Affiliations:
Raytheon BBN Technologies, Cambridge, MA;Raytheon BBN Technologies, Cambridge, MA;Raytheon BBN Technologies, Cambridge, MA
Venue:
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Year:
2011

Citing 8
Cited 2

The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Minimum error rate training in statistical machine translation

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Language model adaptation for automatic speech recognition and statistical machine translation

Language model adaptation for automatic speech recognition and statistical machine translation
Language model adaptation for statistical machine translation with structured query models

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Moses: open source toolkit for statistical machine translation

ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Language and translation model adaptation using comparable corpora

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Translation model based cross-lingual language model adaptation: from word models to phrase models

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Joint and coupled bilingual topic model based sentence representations for language model adaptation

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

The language model (LM) is a critical component in most statistical machine translation (SMT) systems, serving to establish a probability distribution over the hypothesis space. Most SMT systems use a static LM, independent of the source language input. While previous work has shown that adapting LMs based on the input improves SMT performance, none of the techniques has thus far been shown to be feasible for on-line systems. In this paper, we develop a novel measure of cross-lingual similarity for biasing the LM based on the test input. We also illustrate an efficient on-line implementation that supports integration with on-line SMT systems by transferring much of the computational load off-line. Our approach yields significant reductions in target perplexity compared to the static LM, as well as consistent improvements in SMT performance across language pairs (English-Dari and English-Pashto).