This paper explores corpus-based bilingual retrieval using translation corpora that vary in source and size. We find that the quality of the translation alignments and the domain of the bitext are both important, and in some settings these factors are more critical than corpus size. We also show that a judicious choice of tokenization can reduce the amount of bitext required to obtain good bilingual retrieval performance.