Test Collection-Based IR Evaluation Needs Extension toward Sessions --- A Case of Extremely Short Queries

  • Authors:
  • Heikki Keskustalo, Kalervo Järvelin, Ari Pirkola, Tarun Sharma (University of Tampere, Finland)
  • Marianne Lykke (Royal School of Library and Information Science, Denmark)

  • Venue:
  • AIRS '09 Proceedings of the 5th Asia Information Retrieval Symposium on Information Retrieval Technology
  • Year:
  • 2009

Abstract

There is overwhelming evidence that real users of IR systems often prefer extremely short queries (one or two individual words) and try out several queries if needed. Such behavior is fundamentally different from the process modeled in traditional test collection-based IR evaluation, which uses more verbose queries and only one query per topic. In the present paper, we propose an extension to test collection-based evaluation that utilizes sequences of short queries based on empirically grounded but idealized session strategies. We employ TREC data and ask test persons to suggest search words, while simulating the sessions based on the idealized strategies for repeatability and control. The experimental results show that, surprisingly, web-like very short queries (including one-word query sequences) typically lead to good enough results even in a TREC-type test collection. This finding explains the observed real user behavior: as a few very simple attempts normally lead to good enough results, there is no need to invest more effort. We conclude by discussing the consequences of our finding for IR evaluation.
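The session-based evaluation idea sketched in the abstract can be illustrated in code. The sketch below is a minimal, hypothetical reconstruction, not the paper's actual setup: the toy corpus, the term-overlap ranking, and the "good enough" criterion (at least one relevant document retrieved) are all illustrative assumptions standing in for the TREC collection and the empirically grounded strategies.

```python
# Hypothetical sketch: a session issues a sequence of very short queries
# and stops as soon as the result list is "good enough". All data and
# the scoring function are toy assumptions for illustration only.
from collections import Counter

# Toy document collection (doc_id -> text) and relevance judgments.
DOCS = {
    "d1": "information retrieval evaluation with test collections",
    "d2": "web search users prefer short queries",
    "d3": "query session strategies in interactive retrieval",
    "d4": "cooking recipes for pasta",
}
RELEVANT = {"d2", "d3"}  # documents judged relevant for the topic

def search(query, k=3):
    """Rank documents by simple term-overlap score; return top-k doc ids."""
    q_terms = query.lower().split()
    scores = {
        doc_id: sum(Counter(text.split())[t] for t in q_terms)
        for doc_id, text in DOCS.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [d for d in ranked if scores[d] > 0][:k]

def run_session(query_sequence, good_enough=1, k=3):
    """Issue the queries in order; stop at the first result list that
    contains at least `good_enough` relevant documents (the idealized
    stopping strategy)."""
    for i, query in enumerate(query_sequence, start=1):
        results = search(query, k)
        hits = sum(1 for d in results if d in RELEVANT)
        if hits >= good_enough:
            return {"queries_used": i, "success": True, "results": results}
    return {"queries_used": len(query_sequence), "success": False, "results": []}

# An idealized one-word query sequence for the topic; the session stops
# as soon as a single short query already retrieves a relevant document.
session = run_session(["users", "queries", "session"])
print(session)
```

Repeating `run_session` over many topics and query sequences would yield the kind of aggregate observation the paper reports: one-word query sequences often reach a good-enough result within very few attempts.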