Statistical Models for Monolingual and Bilingual Information Retrieval

  • Authors:
  • Nicola Bertoldi; Marcello Federico

  • Affiliations:
  • ITC-irst, Centro per la Ricerca Scientifica e Tecnologica, I-38050, Povo, Italy. bertoldi@itc.it; ITC-irst, Centro per la Ricerca Scientifica e Tecnologica, I-38050, Povo, Italy. federico@itc.it

  • Venue:
  • Information Retrieval
  • Year:
  • 2004

Abstract

This work reviews the information retrieval systems developed at ITC-irst that were evaluated in several CLEF tracks over the last three years. The presentation follows the progress made over time in developing new statistical models, first for monolingual information retrieval and then for cross-language information retrieval. Besides describing the underlying theory, this work reports the performance of the monolingual and bilingual information retrieval models on the Italian monolingual tracks and the Italian-English bilingual tracks of CLEF, respectively. The monolingual systems by ITC-irst performed consistently well in all the official evaluations, while the bilingual system ranked just behind competitors using commercial machine translation engines in CLEF 2002. However, an experimental comparison of our statistical topic translation model against a state-of-the-art commercial system, run on a larger set of queries, showed no statistically significant difference in retrieval performance.