Cumulated gain-based evaluation of IR techniques
ACM Transactions on Information Systems (TOIS)
Eye-tracking analysis of user behavior in WWW search
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
TREC: Experiment and Evaluation in Information Retrieval (Digital Libraries and Electronic Publishing)
Evaluation in (XML) information retrieval: expected precision-recall with user modelling (EPRUM)
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
An experimental comparison of click position-bias models
WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
A user browsing model to predict search engine click data from past observations
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
A new interpretation of average precision
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Rank-biased precision for measurement of retrieval effectiveness
ACM Transactions on Information Systems (TOIS)
Efficient multiple-click models in web search
Proceedings of the Second ACM International Conference on Web Search and Data Mining
Click chain model in web search
Proceedings of the 18th international conference on World wide web
Including summaries in system evaluation
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Empirical justification of the gain and discount function for nDCG
Proceedings of the 18th ACM conference on Information and knowledge management
Expected reciprocal rank for graded relevance
Proceedings of the 18th ACM conference on Information and knowledge management
Click-based evidence for decaying weight distributions in search effectiveness metrics
Information Retrieval
A user behavior model for average precision and its generalization to graded judgments
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Extending average precision to graded relevance judgments
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Learning click models via probit bayesian inference
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Expected browsing utility for web search evaluation
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
System effectiveness, user models, and user utility: a conceptual framework for investigation
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Discounted cumulative gain and user decision models
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
Simulating simple user behavior for system effectiveness evaluation
Proceedings of the 20th ACM international conference on Information and knowledge management
Modeling behavioral factors in interactive information retrieval
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
In this paper, we propose to interpret Discounted Cumulative Gain (DCG) as the expectation of the total utility collected by a user, given a generative probabilistic model of how users browse the ranked result list of a search engine. We contrast this with pAP, a generalization of Average Precision defined in Dupret and Piwowarski (2010) [13]. In both cases, user decision models coupled with Web search logs allow us to estimate parameters that are usually left to the designer of the metric. In this paper, we compare the user models underlying DCG and pAP both at the level of interpretation and experimentally. DCG and AP are computed before a ranking function is exposed to users; as such, their role is to predict the function's performance. In contrast to such a prognostic metric, a diagnostic metric is computed after observing the users' interactions with the result list; a commonly used diagnostic metric is, for example, the clickthrough rate at position 1. In this work we show that the same user model developed for DCG can be used to derive a diagnostic version of the metric. The same holds for pAP and for any metric with a proper user model. We show that not only does this diagnostic view provide new information, it also allows us to define a new criterion for assessing a metric. In previous work based on user decision modeling, the performance of different metrics was compared indirectly, in terms of the ability of the associated user model to predict future user actions. Here we propose a new and more direct criterion: the ability of the prognostic version of a metric to predict its diagnostic performance.
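The expected-utility reading of DCG can be sketched in a few lines of Python. This is a minimal illustration, not the paper's estimation procedure: it only shows that if a browsing model assigns each rank an examination probability proportional to the standard DCG discount 1/log2(rank+1), then the expected total gain collected by the user coincides with DCG. The function names and the example gain vector are illustrative.

```python
import math

def dcg(gains):
    """Standard DCG: gain at rank i (1-based) discounted by 1/log2(i + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def expected_utility(gains, p_examine):
    """Expected total utility under a browsing model: the gain at each rank
    is counted with the probability that the user examines that rank."""
    return sum(p * g for g, p in zip(gains, p_examine))

# Illustrative graded gains for a ranked list of four results.
gains = [3.0, 2.0, 0.0, 1.0]

# A (hypothetical) user model whose examination probabilities equal the
# DCG discount at each rank.
p = [1 / math.log2(i + 2) for i in range(len(gains))]

# Under that model, expected collected utility is exactly DCG.
assert abs(dcg(gains) - expected_utility(gains, p)) < 1e-9
```

The point of the user-model view is that the examination probabilities `p` need not be fixed by the metric's designer; as the abstract notes, they can instead be estimated from Web search logs.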