Evaluating a probabilistic model for cross-lingual information retrieval
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Computational Linguistics - Special issue on web as corpus
An IR approach for translating new words from nonparallel, comparable texts
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Automatic identification of word translations from unrelated English and German corpora
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Dictionary-based techniques for cross-language information retrieval
Information Processing and Management: an International Journal - Special issue: Cross-language information retrieval
BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Improved statistical alignment models
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Language model adaptation for automatic speech recognition and statistical machine translation
Language model adaptation for automatic speech recognition and statistical machine translation
Improving Machine Translation Performance by Exploiting Non-Parallel Corpora
Computational Linguistics
Cross-lingual lexical triggers in statistical language modeling
EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
A hierarchical phrase-based model for statistical machine translation
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Language model adaptation for statistical machine translation with structured query models
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
A simple sentence-level extraction algorithm for comparable data
NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
A beam-search extraction algorithm for comparable data
ACLShort '09 Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
Exploiting comparable corpora with TER and TERp
BUCC '09 Proceedings of the 2nd Workshop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora
Discriminative corpus weight estimation for machine translation
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
Translation model adaptation by resampling
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Domain adaptation for machine translation by mining unseen words
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
On-line language model biasing for statistical machine translation
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Two ways to use a noisy parallel news corpus for improving statistical machine translation
BUCC '11 Proceedings of the 4th Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web
ACM Transactions on Asian Language Information Processing (TALIP)
Parallel sentence generation from comparable corpora for improved SMT
Machine Translation
A minimally supervised approach for detecting and ranking document translation pairs
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
Structural and topical dimensions in multi-task patent translation
EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Journal of Artificial Intelligence Research
Topic models for dynamic translation model adaptation
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Translation model based cross-lingual language model adaptation: from word models to phrase models
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Twitter translation using translation-based cross-lingual retrieval
WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation
Joint and coupled bilingual topic model based sentence representations for language model adaptation
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Hi-index | 0.00 |
Traditionally, statistical machine translation systems have relied on parallel bi-lingual data to train a translation model. While bi-lingual parallel data are expensive to generate, monolingual data are relatively common. Yet monolingual data have been under-utilized, having been used primarily for training a language model in the target language. This paper describes a novel method for utilizing monolingual target data to improve the performance of a statistical machine translation system on news stories. The method exploits the existence of comparable text---multiple texts in the target language that discuss the same or similar stories as found in the source language document. For every source document that is to be translated, a large monolingual data set in the target language is searched for documents that might be comparable to the source documents. These documents are then used to adapt the MT system to increase the probability of generating texts that resemble the comparable document. Experimental results obtained by adapting both the language and translation models show substantial gains over the baseline system.