Noisy parallel corpora have been widely used for cross-language information retrieval (CLIR). However, previous studies have focused only on truly parallel corpora. In this paper, we examine two possible approaches to exploiting noisy corpora: filtering the noise out of the corpora, or adapting the training process of the translation model to the noisy corpora. Our experiments show that the second approach is better suited to CLIR.