Query transitive translation using IR score for Indonesian-Japanese CLIR

Authors:
Ayu Purwarianti;Masatoshi Tsuchiya;Seiichi Nakagawa
Affiliations:
Department of Information and Computer Science, Toyohashi University of Technology;Department of Information and Computer Science, Toyohashi University of Technology;Department of Information and Computer Science, Toyohashi University of Technology
Venue:
AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology
Year:
2005

Abstract

We combined the mutual information score with the TF × IDF score (IR score) to select the best keyword translation in our transitive translation. The transitive translation used bilingual dictionaries to translate an Indonesian query into Japanese keywords, which were then used as the input to retrieve Japanese documents. Keyword selection was done in two steps. The first step sorted the translation candidates by their mutual information scores, calculated from a monolingual target-language corpus. The second step selected the best candidate set from the five sets with the highest mutual information scores, based on their TF × IDF scores. An experiment on the NTCIR-3 Web Retrieval Task data showed that keyword selection based on this combination achieved a higher IR score than a direct translation method using the original Indonesian-Japanese dictionary, and also a higher score than machine translation using the Kataku (Indonesian-English) and Babelfish (English-Japanese) engines.
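
The two-step selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the exact formulas, so several things here are assumptions: mutual information is approximated by pointwise mutual information over document-level co-occurrence, a candidate set is scored by the sum of its pairwise values, and the TF × IDF score of a set is taken as the best single-document score it obtains when used as a query. The function names, the toy corpus, and the romanized tokens are all hypothetical.

```python
import math
from itertools import combinations, product


def pmi(a, b, docs):
    """Pointwise mutual information of two terms from document co-occurrence."""
    n = len(docs)
    df_a = sum(a in d for d in docs)
    df_b = sum(b in d for d in docs)
    df_ab = sum(a in d and b in d for d in docs)
    if df_ab == 0:
        # Assumption: pairs never seen together rank last.
        return float("-inf")
    return math.log(df_ab * n / (df_a * df_b))


def mi_score(candidate_set, docs):
    """Assumed set-level score: sum of PMI over all term pairs in the set."""
    return sum(pmi(a, b, docs) for a, b in combinations(candidate_set, 2))


def tfidf_score(candidate_set, docs):
    """Assumed IR score: best single-document TF x IDF score for the set."""
    n = len(docs)
    best = 0.0
    for d in docs:
        score = 0.0
        for term in candidate_set:
            df = sum(term in other for other in docs)
            if df:
                score += d.count(term) * math.log(n / df)
        best = max(best, score)
    return best


def select_translation(candidates_per_term, docs, top_n=5):
    # Step 1: rank every combination of candidates by mutual information.
    all_sets = list(product(*candidates_per_term))
    ranked = sorted(all_sets, key=lambda s: mi_score(s, docs), reverse=True)
    # Step 2: among the top_n sets, keep the one with the best TF x IDF score.
    return max(ranked[:top_n], key=lambda s: tfidf_score(s, docs))


# Toy target-language corpus (hypothetical romanized tokens).
docs = [
    ["ginkou", "kinri", "kinri", "yokin"],
    ["ginkou", "kinri", "keizai"],
    ["dote", "kawa", "sakana"],
    ["kyoumi", "eiga"],
]
# Hypothetical translation candidates for a two-word query ("bank interest").
candidates = [["ginkou", "dote"], ["kinri", "kyoumi"]]

print(select_translation(candidates, docs))
```

On this toy corpus the sketch selects the pair `("ginkou", "kinri")`, the only combination whose terms co-occur. Enumerating every combination is exponential in the query length, so a realistic system would need to prune candidates before the first step.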