Cross-language text retrieval by query translation using term re-weighting

  • Authors:
  • Insu Kang;Oh-Woog Kwon;Jong-Hyeok Lee;Geunbae Lee

  • Affiliations:
  • Dept. of Computer Science & Engineering, POSTECH (Pohang University of Science & Technology), Korea (all four authors)

  • Venue:
  • Multimodal interface for human-machine communication
  • Year:
  • 2002

Abstract

In dictionary-based query translation for cross-language text retrieval, transfer ambiguity is one of the main causes of performance deterioration, yet the problem has received little attention in this field. To resolve transfer ambiguity, this paper proposes a two-phase query translation method based on term re-weighting, which uses a bilingual transfer dictionary originally designed for machine translation. In general, the terms of a source-language query are associated with one another, so their correct translations are more likely to co-occur in target documents. Based on this simple intuition, the first phase separates the more relevant target documents from the rest. Using statistical and ranking information from these highly relevant documents, the second phase then converts the translated query vector into a re-weighted form that adds extra weight to the target terms that are probably correct. In experiments, the proposed method achieved almost the same performance as the monolingual IR system, improving precision by about 9% over a baseline system.
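
The two-phase idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the scoring or re-weighting formulas, so the even split of weight across dictionary candidates, the term-frequency document score, the rank-discounted bonus (term frequency divided by rank), and the names `translate`, `score`, `reweight`, `top_k`, and `alpha` are all assumptions made for illustration. The toy dictionary and documents are likewise hypothetical.

```python
from collections import Counter


def translate(query, dictionary):
    """Expand each source term into all of its dictionary translations."""
    weights = {}
    for term in query:
        candidates = dictionary.get(term, [])
        for cand in candidates:
            # Assumption: split unit weight evenly over one term's candidates.
            weights[cand] = weights.get(cand, 0.0) + 1.0 / len(candidates)
    return weights


def score(weights, doc_tf):
    """Assumed document score: weighted sum of term frequencies."""
    return sum(w * doc_tf[t] for t, w in weights.items())


def reweight(query, dictionary, docs, top_k=2, alpha=1.0):
    base = translate(query, dictionary)
    tfs = [Counter(doc) for doc in docs]

    # Phase 1: rank target documents with the ambiguous translated query
    # and keep the top_k as the "highly relevant" set.
    order = sorted(range(len(docs)),
                   key=lambda i: score(base, tfs[i]),
                   reverse=True)[:top_k]

    # Phase 2: add extra weight to candidates that occur in the top
    # documents, discounted by rank (assumed form of "ranking information").
    new = dict(base)
    for rank, i in enumerate(order, start=1):
        for term in base:
            new[term] += alpha * tfs[i][term] / rank
    return new


# Hypothetical toy data: one ambiguous source term, one unambiguous term.
dictionary = {"eunhaeng": ["bank", "ginkgo"], "don": ["money"]}
docs = [
    ["bank", "money", "loan", "bank"],
    ["ginkgo", "tree", "leaf"],
    ["money", "bank", "rate"],
]
weights = reweight(["eunhaeng", "don"], dictionary, docs)
print(weights)
```

In this toy run the candidate "bank" co-occurs with "money" in the top-ranked documents and so gains weight, while "ginkgo" keeps only its initial weight; this mirrors the co-occurrence intuition stated in the abstract, under the assumptions listed above.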