Monolingual, bilingual, and GIRT information retrieval at CLEF-2005

  • Authors:
  • Jacques Savoy;Pierre-Yves Berger

  • Affiliations:
  • Institut interfacultaire d’informatique, Université de Neuchâtel, Neuchâtel, Switzerland;Institut interfacultaire d’informatique, Université de Neuchâtel, Neuchâtel, Switzerland

  • Venue:
  • CLEF'05 Proceedings of the 6th international conference on Cross-Language Evalution Forum: accessing Multilingual Information Repositories
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

For our fifth participation in the CLEF evaluation campaigns, our first objective was to propose an effective and general stopword list as well as a light stemming procedure for the Hungarian, Bulgarian and Portuguese (Brazilian) languages. Our second objective was to obtain a better picture of the relative merit of various search engines when processing documents in those languages. To do so we evaluated our scheme using two probabilistic models and five vector-processing approaches. In the bilingual track, we evaluated both the machine translation and bilingual dictionary approaches applied to automatically translate a query submitted in English into various target languages. Finally, using the GIRT corpora (available in English, German and Russian), we investigated the variations in retrieval effectiveness that resulted when we included or excluded manually assigned keywords attached to the bibliographic records (mainly comprising a title and an abstract).