To search corpora written in two or more languages, the simplest and most efficient approach is to translate the submitted query into the required language(s). To this end, we developed an IR model based on translation tools freely available on the Web (bilingual machine-readable dictionaries and machine translation systems). When comparing the retrieval effectiveness of manually and automatically translated queries, we found that manual translation outperformed machine-based approaches, although the performance differences varied from one language to the next. Moreover, a query-by-query analysis showed that the performance of machine-translated queries varied a great deal. We then asked whether we could predict the retrieval performance of a translated query and use this knowledge to select the best translation(s). To do so, we designed and evaluated a predictive system based on logistic regression and used it to select the most appropriate machine-based translations. Using a set of 99 queries and a document collection available in German and Spanish (extracted from the CLEF-2001 and CLEF-2002 test suites), we show that the retrieval performance of the suggested query translation selection procedure is statistically better than that of the single best MT system, but still inferior to the retrieval performance obtained with manual translations.
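The selection idea described above can be illustrated with a minimal sketch. The abstract does not say which predictors the logistic regression uses, so the two features below (fraction of query terms translated, a normalized top-document score), the toy training data, and the system names `mt_a`, `mt_b`, and `dict_c` are all hypothetical. The code fits a small logistic model by gradient ascent and ranks candidate translations by predicted probability of good retrieval performance; it is not the authors' implementation.

```python
import math


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    # w[0] is the bias; w[1:] are the feature weights.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights by batch gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(w, xi)
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def select_translations(w, candidates, k=1):
    """Return the k candidate names with the highest predicted probability.

    candidates maps a (hypothetical) translation system name to the
    feature vector of the query it produced.
    """
    ranked = sorted(
        candidates,
        key=lambda name: predict_proba(w, candidates[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical training data: one row per translated query.
# Features: [fraction of query terms translated, normalized top-document score]
# Label: 1 if the translated query retrieved well, 0 otherwise.
X_train = [[0.9, 0.8], [0.8, 0.7], [1.0, 0.6],
           [0.4, 0.3], [0.5, 0.2], [0.3, 0.4]]
y_train = [1, 1, 1, 0, 0, 0]

weights = fit_logistic(X_train, y_train)

# Hypothetical candidate translations of a single new query.
candidates = {
    "mt_a": [0.95, 0.75],
    "mt_b": [0.45, 0.35],
    "dict_c": [0.70, 0.50],
}
best = select_translations(weights, candidates, k=1)
```

Raising `k` keeps several translations instead of one, which corresponds to selecting the best translation(s) rather than committing to a single system for every query.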