Experiments in multilingual information retrieval using the SPIDER system
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Similarity-based methods for word sense disambiguation
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Identifying word translations in non-parallel texts
ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
Looking for candidate translational equivalents in specialized, comparable corpora
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 2
Mining comparable bilingual text corpora for cross-language information integration
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Improving Machine Translation Performance by Exploiting Non-Parallel Corpora
Computational Linguistics
Creating and exploiting a comparable corpus in cross-language information retrieval
ACM Transactions on Information Systems (TOIS)
A geometric view on bilingual lexicon extraction from comparable corpora
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Statistical Language Models for Information Retrieval A Critical Review
Foundations and Trends in Information Retrieval
CoNLL '09 Proceedings of the Thirteenth Conference on Computational Natural Language Learning
On the use of comparable corpora to improve SMT performance
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Cross-Language Information Retrieval
Cross-Language Information Retrieval
Improving corpus comparability for bilingual lexicon extraction from comparable corpora
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Clustering comparable corpora for bilingual lexicon extraction
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Topic based creation of a persian-english comparable corpus
AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
Detecting highly confident word translations from comparable corpora without any prior knowledge
EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Hi-index | 0.00 |
A main challenge in Cross-Language information retrieval is to estimate a translation language model, as its quality directly affects the retrieval performance. The translation language model is built using translation resources such as bilingual dictionaries, parallel corpora, or comparable corpora. In general, high quality resources may not be available for scarce-resource languages. For these languages, efficient exploitation of commonly available resources such as comparable corpora is considered more crucial. In this paper, we focus on using only comparable corpora to extract translation information more efficiently. We propose a language modeling approach for estimating the translation language model. The proposed method is based on probability distribution estimation, and can be tuned easier in comparison with heuristically adjusted previous work. Experiment results show a significant improvement in the translation quality and CLIR performance compared to the previous approaches.