In this paper, we describe our approach to the CLEF Cross-Language IR (CLIR) tasks. In our experiments, we used statistical translation models for query translation. Some of the models were trained on parallel web pages automatically mined from the Web; others were trained from bilingual dictionaries and lexical databases. These models were combined for query translation. Our goal in this series of experiments was to test whether parallel web pages can be used effectively to translate queries in multilingual IR. In particular, we compared models trained on Web documents with models that also incorporate other resources, such as dictionaries. Our results show that models trained on parallel web pages can achieve reasonable CLIR performance. However, combining models effectively is a difficult task, and single models still yield better results.
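As a concrete illustration of the model-combination step mentioned above, the sketch below linearly interpolates translation probabilities from two toy models, one standing in for a web-trained model and one for a dictionary-based model. The model tables, interpolation weights, and word pairs are invented for illustration only and are not the paper's actual models or parameters.

```python
# Minimal sketch of combining translation models by linear interpolation:
#   p(t|s) = sum_i w_i * p_i(t|s)
# All probability tables and weights here are illustrative assumptions.

def combine_models(models, weights, source_word, top_k=3):
    """Interpolate p(target|source) across several translation models
    and return the top-k candidate translations for a query term."""
    combined = {}
    for model, w in zip(models, weights):
        for target, p in model.get(source_word, {}).items():
            combined[target] = combined.get(target, 0.0) + w * p
    return sorted(combined.items(), key=lambda kv: -kv[1])[:top_k]

# Toy "web-trained" and "dictionary-based" models (invented numbers)
web_model = {"maison": {"house": 0.6, "home": 0.3, "building": 0.1}}
dict_model = {"maison": {"house": 0.8, "household": 0.2}}

# Interpolation weights favoring the web-trained model (hypothetical)
candidates = combine_models([web_model, dict_model], [0.7, 0.3], "maison")
```

The top candidates would then be used as query terms in the target language, typically with the interpolated probabilities serving as term weights in retrieval.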