IR research has a strong tradition of laboratory evaluation of systems. Such research is based on test collections, pre-defined test topics, and standard evaluation metrics. While recent research has emphasized the user viewpoint by proposing user-based metrics and non-binary relevance assessments, these methods remain insufficient for truly user-based evaluation. The common assumption of a single query per topic and session poorly represents real-life searching. On the other hand, one well-known metric for multiple queries per session, instance recall, does not capture early (within-session) retrieval of (highly) relevant documents. We propose an extension to the Discounted Cumulated Gain (DCG) metric, the Session-based DCG (sDCG) metric, for evaluation scenarios involving multiple-query sessions, graded relevance assessments, and open-ended user effort, including decisions to stop searching. The sDCG metric discounts relevant results retrieved by later queries within a session. We exemplify the sDCG metric with data from an interactive experiment, discuss how the metric might be applied, and present research questions for which the metric is helpful.
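As a minimal illustration of how such a session-level discount could be computed, the sketch below assumes a logarithmic discounting scheme: within each query's ranked result list, graded gains are discounted by log_b(rank) beyond rank b (standard DCG), and the DCG of the j-th query in the session is further divided by (1 + log_bq(j)), so documents found only after reformulations contribute less. The function names, parameter defaults (b = 2, bq = 4), and example gain values are illustrative assumptions, not figures from the paper's experiments.

```python
import math


def dcg(gains, b=2):
    """Discounted cumulated gain for one ranked result list.

    Gains at ranks deeper than b are divided by log_b(rank);
    earlier ranks contribute their full graded gain.
    """
    total = 0.0
    for rank, gain in enumerate(gains, start=1):
        discount = math.log(rank, b) if rank > b else 1.0
        total += gain / discount
    return total


def sdcg(session, b=2, bq=4):
    """Session-based DCG sketch.

    `session` is a list of per-query gain lists, in the order the
    queries were issued. The DCG of the j-th query is divided by
    (1 + log_bq(j)), so relevant results retrieved by later
    reformulations count less than those retrieved early.
    """
    total = 0.0
    for j, gains in enumerate(session, start=1):
        query_discount = 1.0 + math.log(j, bq)
        total += dcg(gains, b) / query_discount
    return total


# Hypothetical three-query session with graded relevance gains (0-3):
session = [
    [3, 0, 1, 0],  # first query
    [2, 2, 0],     # first reformulation
    [0, 3, 1],     # second reformulation
]
print(round(sdcg(session), 3))
```

Under this scheme, a highly relevant document retrieved at rank 1 of the first query contributes its full gain, while the same document found only by a later reformulation is doubly discounted, reflecting the additional user effort spent reformulating.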