In this paper, we address statistical machine translation (SMT) of public conference talks. Modeling the style of this genre is challenging because little in-domain training data is available. We investigate a hybrid language model (LM) in which infrequent words are mapped to classes. Hybrid LMs complement word-based LMs with statistics about the language style of the talks. We report extensive experiments comparing different hybrid LM settings on publicly available benchmarks based on TED talks, from Arabic to English and from English to French. As the target language model of a phrase-based SMT system, the proposed models exploit in-domain data better than conventional word-based LMs.
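
The mapping of infrequent words to classes can be illustrated with a minimal sketch. The abstract does not specify how classes are defined or where the frequency cut-off lies, so the count threshold `min_count`, the part-of-speech-like class labels, and the names `build_hybrid_mapper` and `word_to_class` below are all assumptions made for illustration, not the authors' settings.

```python
from collections import Counter

def build_hybrid_mapper(corpus, word_to_class, min_count=2,
                        default_class="<UNK_CLASS>"):
    """Return a function that rewrites a sentence into hybrid tokens.

    Words seen at least `min_count` times in `corpus` are kept as-is;
    all other words are replaced by their class label (hypothetical
    labels supplied by the caller in `word_to_class`).
    """
    counts = Counter(tok for sent in corpus for tok in sent)
    frequent = {w for w, c in counts.items() if c >= min_count}

    def to_hybrid(sentence):
        return [tok if tok in frequent
                else word_to_class.get(tok, default_class)
                for tok in sentence]

    return to_hybrid

# Toy in-domain corpus (invented for illustration).
corpus = [
    "we can change the world".split(),
    "we can build the future".split(),
]
# Hypothetical class assignment, e.g. from a part-of-speech tagger.
word_to_class = {"change": "VERB", "build": "VERB",
                 "world": "NOUN", "future": "NOUN"}

to_hybrid = build_hybrid_mapper(corpus, word_to_class, min_count=2)
hybrid_corpus = [to_hybrid(sent) for sent in corpus]
```

Under these assumptions, an n-gram LM would then be trained on `hybrid_corpus` instead of the raw text, so that frequent words keep their identity while rare words share statistics through their class.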