Wider context by using bilingual language models in machine translation

  • Authors:
  • Jan Niehues; Teresa Herrmann; Stephan Vogel; Alex Waibel

  • Affiliations:
  • Institute for Anthropomatics, KIT - Karlsruhe Institute of Technology, Germany; Institute for Anthropomatics, KIT - Karlsruhe Institute of Technology, Germany; Carnegie Mellon University; Institute for Anthropomatics, KIT - Karlsruhe Institute of Technology, Germany, and Carnegie Mellon University

  • Venue:
  • WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
  • Year:
  • 2011

Abstract

In past evaluations for machine translation of European languages, it was shown that the translation performance of SMT systems can be increased by integrating a bilingual language model into a phrase-based SMT system. In the bilingual language model, target words together with their aligned source words form the tokens of an n-gram language model. We analyze the effect of bilingual language models and show where they help to better model the translation process. We show improvements in translation quality on German-to-English and Arabic-to-English. In addition, for the Arabic-to-English task, training an additional bilingual language model on POS tags instead of surface word forms led to further improvements.
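
The following is a minimal illustrative sketch (not the authors' code) of how bilingual tokens of the kind described in the abstract can be built from a word-aligned sentence pair: each target word is joined with the source words aligned to it, and the resulting token sequence, in target order, can then be fed to a standard n-gram language model toolkit. The token format, the `NULL` marker for unaligned words, and the function names are assumptions for illustration only.

```python
# Sketch: constructing bilingual tokens from a word-aligned sentence pair.
# Assumptions (not from the paper): tokens are "target_source1_source2",
# unaligned target words receive the marker "NULL".

from collections import defaultdict

def bilingual_tokens(source, target, alignment):
    """source/target: lists of words; alignment: iterable of (src_idx, tgt_idx) links."""
    links = defaultdict(list)
    for s, t in alignment:
        links[t].append(source[s])
    tokens = []
    for t, word in enumerate(target):
        aligned = "_".join(sorted(links[t])) if links[t] else "NULL"
        tokens.append(f"{word}_{aligned}")
    return tokens

# Toy German-to-English example
src = ["das", "haus", "ist", "klein"]
tgt = ["the", "house", "is", "small"]
align = {(0, 0), (1, 1), (2, 2), (3, 3)}
print(bilingual_tokens(src, tgt, align))
# -> ['the_das', 'house_haus', 'is_ist', 'small_klein']
```

The same construction applies when surface forms are replaced by POS tags on both sides, which corresponds to the extra POS-based bilingual language model mentioned for the Arabic-to-English task.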