Most information retrieval evaluation metrics are designed to measure the satisfaction a user derives from the results returned by a search engine. To estimate that satisfaction, most of these metrics rest on an underlying user model that describes how users interact with the result list. The quality of an evaluation metric is therefore a direct function of the quality of its user model. This paper proposes EBU, a new evaluation metric whose user model is tuned on observations from many thousands of real search sessions. We compare EBU with several state-of-the-art evaluation metrics and show that it correlates better with real user behavior as captured by clicks.
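To make the notion of an "underlying user model" concrete, below is a minimal Python sketch of one well-known metric in this family, expected reciprocal rank (ERR): the modeled user scans the ranking top-down and stops at each document with a probability determined by its relevance grade. This is purely an illustrative example of the metric family discussed above, not an implementation of EBU, whose browsing model is instead estimated from logged search sessions.

```python
def err(grades, max_grade=4):
    """Expected Reciprocal Rank (Chapelle et al., 2009).

    grades: graded relevance of the ranked documents, top first.
    User model: scan down the list; at rank r, stop with probability
    R_r (mapped from the grade), otherwise continue to rank r + 1.
    The utility of stopping at rank r is 1/r.
    """
    p_reach = 1.0  # probability the user reaches the current rank
    score = 0.0
    for rank, grade in enumerate(grades, start=1):
        r_stop = (2 ** grade - 1) / 2 ** max_grade  # P(satisfied | examined)
        score += p_reach * r_stop / rank
        p_reach *= 1.0 - r_stop
    return score


# Example: a highly relevant top result dominates the score.
print(err([4, 0, 2]))  # ~0.95
print(err([0, 0, 4]))  # ~0.31
```

The stopping probabilities here are fixed functions of editorial relevance grades; the abstract's central point is that EBU replaces such hand-set parameters with ones fit to observed search sessions.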