Statistical cross-language information retrieval using n-best query translations

  • Authors: Marcello Federico; Nicola Bertoldi

  • Affiliations: ITC-irst Centro per la Ricerca Scientifica e Tecnologica, Trento, Italy (both authors)

  • Venue: SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval

  • Year: 2002

Abstract

This paper presents a novel statistical model for cross-language information retrieval. Given a written query in the source language, documents in the target language are ranked by integrating probabilities computed by two statistical models: a query-translation model, which generates the most probable term-by-term translations of the query, and a query-document model, which evaluates the likelihood of each document paired with each translation. The two scores are integrated over the set of the N most probable translations of the query. Experimental results for N = 1, 5, and 10 are presented on the Italian-English bilingual track data used in the CLEF 2000 and 2001 evaluation campaigns.
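
The integration step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each document's score is the sum, over the N-best translations, of the translation probability times a query-document likelihood, and it swaps in a simple unigram model with linear interpolation smoothing for the query-document model. The function names `query_likelihood` and `rank_documents`, the smoothing weight `lam`, and the toy data are all hypothetical.

```python
from collections import Counter


def query_likelihood(translation, doc_tokens, coll_counts, coll_size, lam=0.5):
    """Likelihood of a translated query given a document.

    Assumption: unigram model with linear interpolation between document
    and collection frequencies (a stand-in for the query-document model).
    """
    doc_counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    prob = 1.0
    for term in translation:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = coll_counts.get(term, 0) / coll_size
        prob *= lam * p_doc + (1.0 - lam) * p_coll
    return prob


def rank_documents(nbest, docs, lam=0.5):
    """Rank documents by summing over the N-best query translations.

    nbest: list of (translation_tokens, translation_probability) pairs,
           assumed to come from some query-translation model.
    docs:  dict mapping a document id to its list of tokens.
    """
    coll_counts = Counter(tok for toks in docs.values() for tok in toks)
    coll_size = sum(coll_counts.values())
    scores = {}
    for doc_id, tokens in docs.items():
        scores[doc_id] = sum(
            p_trans * query_likelihood(trans, tokens, coll_counts, coll_size, lam)
            for trans, p_trans in nbest
        )
    # Highest combined score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy example with N = 2 hypothetical translations of one query.
docs = {
    "d1": ["bank", "money", "loan"],
    "d2": ["river", "bank", "water"],
}
nbest = [(["bank", "money"], 0.6), (["shore", "money"], 0.4)]
ranked = rank_documents(nbest, docs)
```

In this sketch, a translation containing a term that never occurs in the collection contributes zero to every document's score. Setting N = 1 reduces the ranking to retrieval with the single best translation.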