Information retrieval as statistical translation
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Evaluating a probabilistic model for cross-lingual information retrieval
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Pattern Classification (2nd Edition)
Pattern Classification (2nd Edition)
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Minimum error rate training in statistical machine translation
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Linear discriminant model for information retrieval
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Language model adaptation for automatic speech recognition and statistical machine translation
Language model adaptation for automatic speech recognition and statistical machine translation
The Alignment Template Approach to Statistical Machine Translation
Computational Linguistics
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
A hierarchical phrase-based model for statistical machine translation
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Language model adaptation for statistical machine translation with structured query models
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Hierarchical Phrase-Based Translation
Computational Linguistics
Retrieval models for question and answer archives
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Bilingual LSA-based adaptation for statistical machine translation
Machine Translation
Resampling auxiliary data for language model adaptation in machine translation for speech
ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Language and translation model adaptation using comparable corpora
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Mixture-model adaptation for SMT
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Intelligent selection of language model training data
ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Cross lingual adaptation: an experiment on sentiment classifications
ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Clickthrough-based translation models for web search: from word models to phrase models
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Phrase-based translation model for question retrieval in community question answer archives
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
On-line language model biasing for statistical machine translation
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Domain adaptation via pseudo in-domain data selection
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Joint and coupled bilingual topic model based sentence representations for language model adaptation
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Hi-index | 0.00 |
In this paper, we propose a novel translation model (TM) based cross-lingual data selection model for language model (LM) adaptation in statistical machine translation (SMT), from word models to phrase models. Given a source sentence in the translation task, this model directly estimates the probability that a sentence in the target LM training corpus is similar. Compared with the traditional approaches which utilize the first pass translation hypotheses, cross-lingual data selection model avoids the problem of noisy proliferation. Furthermore, phrase TM based cross-lingual data selection model is more effective than the traditional approaches based on bag-of-words models and word-based TM, because it captures contextual information in modeling the selection of phrase as a whole. Experiments conducted on large-scale data sets demonstrate that our approach significantly outperforms the state-of-the-art approaches on both LM perplexity and SMT performance.