Classifying and filtering blind feedback terms to improve information retrieval effectiveness

  • Authors:
  • Johannes Leveling;Gareth J. F. Jones

  • Affiliations:
  • Dublin City University, Dublin, Ireland; Dublin City University, Dublin, Ireland

  • Venue:
  • RIAO '10 Adaptivity, Personalization and Fusion of Heterogeneous Information
  • Year:
  • 2010

Abstract

The classification of blind relevance feedback (BRF) terms described in this paper aims at increasing precision or recall by determining which terms decrease, increase, or do not change the corresponding information retrieval (IR) performance metric. Classification and IR experiments are performed on the German and English GIRT data using the BM25 retrieval model. Several basic memory-based classifiers are trained on different feature sets, grouping together features from different query expansion (QE) approaches. Combined classifiers employ the results of the basic classifiers and correctness predictions as features. The best combined classifiers for German (English) yield 22.9% (26.4%) and 5.8% (1.9%) improvement in term classification with respect to precision and recall, respectively, compared to the best basic classifiers. IR experiments based on this term classification have also been performed. Filtering out different types of BRF terms shows that selecting feedback terms predicted to increase precision significantly improves average precision compared to experiments without BRF. MAP is improved by +19.8% compared to the best standard BRF experiment (+11% for German). BRF term classification also increases the number of relevant retrieved documents, geometric MAP, and P@10 in comparison to standard BRF. Experiments based on an optimal classification show that there is potential for improving IR effectiveness even further.
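The core idea of the abstract — keep only those candidate expansion terms that a classifier predicts will increase precision, and discard the rest before query expansion — can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the classifier, the single `weight` feature, and the thresholds are stand-ins for the memory-based classifiers and richer feature sets the authors actually use.

```python
# Hypothetical sketch of BRF term filtering: candidate expansion terms
# are classified by their predicted effect on precision ("increase",
# "decrease", or "no_change"), and only "increase" terms are kept.
# The classifier below is a toy stand-in, not the paper's method.

def classify_term(features):
    """Toy classifier: predict a term's effect on precision from a
    single hypothetical feedback weight (thresholds are illustrative)."""
    weight = features["weight"]
    if weight > 0.5:
        return "increase"
    if weight < 0.1:
        return "decrease"
    return "no_change"

def filter_feedback_terms(candidates):
    """Keep only candidate expansion terms predicted to increase
    precision; all other terms are filtered out before expansion."""
    return [term for term, feats in candidates.items()
            if classify_term(feats) == "increase"]

# Example: three candidate BRF terms with made-up feature values.
candidates = {
    "retrieval": {"weight": 0.8},
    "document":  {"weight": 0.3},
    "the":       {"weight": 0.05},
}
print(filter_feedback_terms(candidates))  # -> ['retrieval']
```

In the paper's setup, the kept terms would then be appended to the original query before a second BM25 retrieval pass; the sketch only shows the filtering step.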