Time-based calibration of effectiveness measures

Authors:
Mark D. Smucker;Charles L.A. Clarke
Affiliations:
University of Waterloo, Waterloo, ON, Canada;University of Waterloo, Waterloo, ON, Canada
Venue:
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Year:
2012

Citing 35
Cited 18

Incremental relevance feedback

SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
Evaluation measures for interactive information retrieval

Information Processing and Management: an International Journal - Special issue on evaluation issues in information retrieval
Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval

SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Document length normalization

Information Processing and Management: an International Journal - Special issue: history of information science
Time, relevance and interaction modelling for information retrieval

Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
Advantages of query biased summaries in information retrieval

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Why batch and user evaluations do not give the same results

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Cumulated gain-based evaluation of IR techniques

ACM Transactions on Information Systems (TOIS)
On the Resemblance and Containment of Documents

SEQUENCES '97 Proceedings of the Compression and Complexity of Sequences 1997
Evaluating implicit feedback models using searcher simulations

ACM Transactions on Information Systems (TOIS)
Evaluating evaluation metrics based on the bootstrap

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
On GMAP: and other transformations

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Hits on the web: how does it compare?

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
A probability ranking principle for interactive information retrieval

Information Retrieval
Eye tracking and online search: Lessons learned and challenges ahead

Journal of the American Society for Information Science and Technology
How do users find things with PubMed?: towards automatic utility evaluation with user simulations

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
A new interpretation of average precision

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Rank-biased precision for measurement of retrieval effectiveness

ACM Transactions on Information Systems (TOIS)
Including summaries in system evaluation

Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Methods for Evaluating Interactive Information Retrieval Systems with Users

Methods for Evaluating Interactive Information Retrieval Systems with Users
Modeling Expected Utility of Multi-session Information Distillation

ICTIR '09 Proceedings of the 2nd International Conference on Theory of Information Retrieval: Advances in Information Retrieval Theory
Expected reciprocal rank for graded relevance

Proceedings of the 18th ACM conference on Information and knowledge management
Test Collection-Based IR Evaluation Needs Extension toward Sessions --- A Case of Extremely Short Queries

AIRS '09 Proceedings of the 5th Asia Information Retrieval Symposium on Information Retrieval Technology
Click-based evidence for decaying weight distributions in search effectiveness metrics

Information Retrieval
A user behavior model for average precision and its generalization to graded judgments

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Human performance and retrieval precision revisited

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Extending average precision to graded relevance judgments

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Expected browsing utility for web search evaluation

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Expected reading effort in focused retrieval evaluation

Information Retrieval
Report on the SIGIR 2010 workshop on the simulation of interaction

ACM SIGIR Forum
A comparative analysis of cascade measures for novelty and diversity

Proceedings of the fourth ACM international conference on Web search and data mining
The economics in interactive information retrieval

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
System effectiveness, user models, and user utility: a conceptual framework for investigation

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Discounted cumulative gain and user decision models

SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
Simulating simple user behavior for system effectiveness evaluation

Proceedings of the 20th ACM international conference on Information and knowledge management

Cited by:

Modeling user variance in time-biased gain

Proceedings of the Symposium on Human-Computer Interaction and Information Retrieval
Stochastic simulation of time-biased gain

Proceedings of the 21st ACM international conference on Information and knowledge management
Models and metrics: IR evaluation as a user process

Proceedings of the Seventeenth Australasian Document Computing Symposium
HCIR 2012: the sixth international symposium on human-computer interaction and information retrieval

ACM SIGIR Forum
Tempo of search actions to modeling successful sessions

ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
Summaries, ranked retrieval and sessions: a unified framework for information access evaluation

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
How query cost affects search behavior

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Click model-based information retrieval metrics

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
A general evaluation measure for document organization tasks

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
SIGIR 2013 workshop on modeling user behavior for information retrieval evaluation

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
The seventeenth Australasian document computing symposium

ACM SIGIR Forum
Editorial: Introduction to special issue on human-computer information retrieval

Information Processing and Management: an International Journal
Modeling behavioral factors in interactive information retrieval

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Users versus models: what observation tells us about effectiveness metrics

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
The water filling model and the cube test: multi-dimensional evaluation for professional search

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Slow Search: Information Retrieval without Time Constraints

Proceedings of the Symposium on Human-Computer Interaction and Information Retrieval
Report on the SIGIR 2013 workshop on modeling user behavior for information retrieval evaluation (MUBE 2013)

ACM SIGIR Forum
Evaluation in Music Information Retrieval

Journal of Intelligent Information Systems

Abstract

Many current effectiveness measures incorporate simplifying assumptions about user behavior. These assumptions prevent the measures from reflecting aspects of the search process that directly impact the quality of retrieval results as experienced by the user. In particular, these measures implicitly model users as working down a list of retrieval results, spending equal time assessing each document. In reality, even a careful user, intending to identify as much relevant material as possible, must spend longer on some documents than on others. Aspects such as document length, duplicates and summaries all influence the time required. In this paper, we introduce a time-biased gain measure, which explicitly accommodates such aspects of the search process. By conducting an appropriate user study, we calibrate and validate the measure against the TREC 2005 Robust Track test collection. We examine properties of the measure, contrasting it to traditional effectiveness measures, and exploring its extension to other aspects and environments. As its primary benefit, the measure allows us to evaluate system performance in human terms, while maintaining the simplicity and repeatability of system-oriented tests. Overall, we aim to achieve a clearer connection between user-oriented studies and system-oriented tests, allowing us to better transfer insights and outcomes from one to the other.