Query transitive translation using IR score for Indonesian-Japanese CLIR

  • Authors:
  • Ayu Purwarianti;Masatoshi Tsuchiya;Seiichi Nakagawa

  • Affiliations:
  • Department of Information and Computer Science, Toyohashi University of Technology;Department of Information and Computer Science, Toyohashi University of Technology;Department of Information and Computer Science, Toyohashi University of Technology

  • Venue:
  • AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology
  • Year:
  • 2005

Quantified Score

Hi-index 0.02

Abstract

We combined the mutual information score and the TF × IDF score (IR score) to select the best keyword translations in our transitive translation. The transitive translation uses bilingual dictionaries to translate an Indonesian query into Japanese keywords, which are then used as input to retrieve Japanese documents. Keyword selection is done in two steps. The first step sorts the translation candidates by their mutual information scores, calculated from a monolingual target-language corpus. The second step selects the best candidate set from the five with the highest mutual information scores, based on their TF × IDF scores. An experiment on the NTCIR-3 Web Retrieval Task data shows that keyword selection based on this combination achieved a higher IR score than a direct translation method using the original Indonesian-Japanese dictionary, and also a higher score than machine translation using the Kataku (Indonesian-English) and Babelfish (English-Japanese) engines.
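
The two-step selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that the mutual information of a candidate set is the sum of pairwise pointwise mutual information over document-level co-occurrence, and that the TF × IDF score of a set is the sum of corpus-wide TF × IDF over its keywords. The function names (`pmi`, `mi_score`, `tfidf`, `select_keywords`) and the romanized toy corpus are hypothetical.

```python
import math
from itertools import combinations, product


def pmi(a, b, docs):
    """Pointwise mutual information of two terms from document co-occurrence.

    Assumption: pairs that never co-occur contribute 0 rather than -infinity.
    """
    n = len(docs)
    da = sum(1 for d in docs if a in d)
    db = sum(1 for d in docs if b in d)
    dab = sum(1 for d in docs if a in d and b in d)
    if dab == 0:
        return 0.0
    return math.log((dab * n) / (da * db))


def mi_score(keywords, docs):
    """Assumed set-level score: sum of PMI over all keyword pairs."""
    return sum(pmi(a, b, docs) for a, b in combinations(keywords, 2))


def tfidf(term, docs):
    """Corpus-wide term frequency times log inverse document frequency."""
    df = sum(1 for d in docs if term in d)
    if df == 0:
        return 0.0
    tf = sum(d.count(term) for d in docs)
    return tf * math.log(len(docs) / df)


def select_keywords(candidates, docs, top_k=5):
    """Two-step selection over candidate sets.

    candidates: one list of translation candidates per query term.
    Step 1: rank every combination by mutual information, keep the top_k.
    Step 2: among those, return the set with the highest summed TF x IDF.
    """
    sets = list(product(*candidates))
    ranked = sorted(sets, key=lambda s: mi_score(s, docs), reverse=True)[:top_k]
    return max(ranked, key=lambda s: sum(tfidf(t, docs) for t in s))


# Toy target-language corpus (romanized, made-up data for illustration only).
docs = [
    ["ginkou", "kinri", "ginkou"],
    ["ginkou", "kinri"],
    ["dote", "kawa"],
    ["kinri", "keizai"],
]
# Two query terms, each with two dictionary translation candidates.
candidates = [["ginkou", "dote"], ["kinri", "kyoumi"]]
best = select_keywords(candidates, docs)
print(best)
```

Enumerating every combination with `product` is only practical for short queries with few candidates per term; a real system would likely prune candidates before this step.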