In dictionary-based query translation for cross-language text retrieval, transfer ambiguity is one of the main causes of performance deterioration, yet the problem has received little attention in the field. To resolve transfer ambiguity, this paper proposes a two-phase query translation method based on term re-weighting, which uses a bilingual transfer dictionary originally designed for machine translation. In general, the terms of a source-language query are associated with one another, so their correct translations are more likely to co-occur in target documents. Based on this simple intuition, the first phase separates the more relevant target documents from the rest. Using statistical and ranking information from these highly relevant documents, the second phase then re-weights the translated query vector, giving extra weight to target terms that are probably correct. In experiments, the proposed method achieved almost the same performance as the monolingual IR system and improved precision by about 9% over a baseline system.
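The two-phase idea can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names (`translate`, `first_phase`, `second_phase`), the toy dictionary and documents, the coverage-based ranking in the first phase, and the rank-discounted frequency formula in the second phase. The abstract does not specify the actual scoring or re-weighting formulas.

```python
from collections import Counter

def translate(query, dictionary):
    # Keep every candidate translation the dictionary lists for each source term.
    return {term: list(dictionary.get(term, [])) for term in query}

def first_phase(candidates, docs):
    # Assumed ranking: count how many distinct source terms have at least one
    # translation in the document, so co-occurring translations rank higher.
    scored = []
    for doc_id, text in docs.items():
        tokens = set(text.split())
        covered = sum(1 for targets in candidates.values() if tokens & set(targets))
        if covered:
            scored.append((-covered, doc_id))
    return [doc_id for _, doc_id in sorted(scored)]

def second_phase(candidates, docs, ranking, top_k=2):
    # Assumed re-weighting: start every target term at 1.0, then add its
    # frequency in each top-ranked document, discounted by rank position.
    weights = {t: 1.0 for targets in candidates.values() for t in targets}
    for position, doc_id in enumerate(ranking[:top_k]):
        counts = Counter(docs[doc_id].split())
        for t in weights:
            weights[t] += counts[t] / (position + 1)
    return weights

# Toy data (hypothetical): "banco" is ambiguous between "bank" and "bench".
dictionary = {"banco": ["bank", "bench"], "dinero": ["money"]}
docs = {
    "d1": "the bank lends money to customers",
    "d2": "a wooden bench in the park",
    "d3": "money deposited at the bank earns interest",
}

candidates = translate(["banco", "dinero"], dictionary)
ranking = first_phase(candidates, docs)
weights = second_phase(candidates, docs, ranking)
print(ranking)
print(weights)
```

In this toy run the translation "bank" ends up with a higher weight than "bench", because only "bank" co-occurs with "money" in the top-ranked documents. The re-weighted vector would then be used for a second retrieval pass.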