Multilingual Information Retrieval Based on Parallel Texts from the Web

  • Authors:
  • Jian-Yun Nie; Michel Simard; George Foster

  • Venue:
  • CLEF '00 Revised Papers from the Workshop of Cross-Language Evaluation Forum on Cross-Language Information Retrieval and Evaluation
  • Year:
  • 2000

Abstract

In this paper, we describe our approach to the CLEF Cross-Language IR (CLIR) tasks. In our experiments, we used statistical translation models for query translation. Some of the models are trained on parallel web pages that are automatically mined from the Web; others are trained on bilingual dictionaries and lexical databases. These models are then combined for query translation. Our goal in this series of experiments is to test whether parallel web pages can be used effectively to translate queries in multilingual IR. In particular, we compare models trained only on Web documents with models that also incorporate other resources such as dictionaries. Our results show that the models trained on parallel web pages can achieve reasonable CLIR performance. However, combining models effectively is a difficult task, and single models still yield better results than combined ones.
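
As an illustration of the model combination mentioned in the abstract, the following is a minimal sketch of query translation with two word-translation tables merged by linear interpolation. The abstract does not say how the models were actually combined, so the interpolation scheme, the equal weights, the function names (`combine_models`, `translate_query`), and all probabilities in the toy tables are assumptions made purely for illustration.

```python
from collections import defaultdict


def combine_models(models, weights):
    """Linearly interpolate word-translation tables p(target | source).

    Assumption: linear interpolation is used here only as one common
    way to combine models; it is not taken from the paper.
    """
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    combined = defaultdict(lambda: defaultdict(float))
    for model, weight in zip(models, weights):
        for src, targets in model.items():
            for tgt, prob in targets.items():
                combined[src][tgt] += weight * prob
    return combined


def translate_query(query_words, model, top_n=2):
    """Return the top_n target words by summed translation probability."""
    scores = defaultdict(float)
    for word in query_words:
        # .get avoids inserting unknown source words into the table
        for tgt, prob in model.get(word, {}).items():
            scores[tgt] += prob
    # Sort by descending score, then alphabetically to break ties
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy English-to-French tables with invented probabilities (hypothetical data).
web_model = {
    "drug": {"drogue": 0.6, "médicament": 0.4},
    "traffic": {"trafic": 0.9, "circulation": 0.1},
}
dict_model = {
    "drug": {"drogue": 0.5, "médicament": 0.5},
    "traffic": {"trafic": 0.5, "circulation": 0.5},
}

combined = combine_models([web_model, dict_model], [0.5, 0.5])
top = translate_query(["drug", "traffic"], combined, top_n=2)
print(top)
```

In this sketch the translated query is simply the highest-scoring target words; choosing the interpolation weights is where the difficulty noted in the abstract would arise, since a poorly weighted combination can perform worse than either table alone.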