Pivoted document length normalization
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Introduction to Modern Information Retrieval
Introduction to Modern Information Retrieval
Hi-index | 0.00 |
This paper describes our work at CLEF 2006 Robust task. This is an ad-hoc task that explores methods for stable retrieval by focusing on poorly performing topics. We have participated in all subtasks: monolingual (English, French, Italian and Spanish), bilingual (Italian to Spanish) and multilingual (Spanish to [English, French, Italian and Spanish]). In monolingual retrieval we have focused our effort on local query expansion, i.e. using only the information from retrieved documents, not from the complete document collection or external corpora, such as the Web. Some local expansion techniques were applied for training topics. Regarding robustness the most effective one was the use of co-occurrence based thesauri, which were constructed using co-occurrence relations in windows of terms, not in complete documents. This is an effective technique that can be easily implemented by tuning only a few parameters. In bilingual and multilingual retrieval experiments several machine translation programs were used to translate topics. For each target language, translations were merged before performing a monolingual retrieval. We also applied the same local expansion technique. In multilingual retrieval, weighted max-min normalization was used to merge lists. In all the subtasks in which we participated our mandatory runs (using title and description fields of the topics) obtained very good rankings. Runs with short queries (only title field) also obtained high MAP and GMAP values using the same expansion technique.