Expected reciprocal rank for graded relevance

Authors:
Olivier Chapelle;Donald Metzler;Ya Zhang;Pierre Grinspan
Affiliations:
Yahoo! Labs, Santa Clara, CA, USA;Yahoo! Labs, Santa Clara, CA, USA;Yahoo! Labs, Sunnyvale, CA, USA;Google Inc, San Bruno, CA, USA
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Citing 18
Cited 107

Cumulated gain-based evaluation of IR techniques

ACM Transactions on Information Systems (TOIS)
The Philosophy of Information Retrieval Evaluation

CLEF '01 Revised Papers from the Second Workshop of the Cross-Language Evaluation Forum on Evaluation of Cross-Language Information Retrieval Systems
Beyond independent relevance: methods and evaluation metrics for subtopic retrieval

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Binary and graded relevance in IR evaluations: comparison of the effects on ranking of IR systems

Information Processing and Management: an International Journal
Learning to rank using gradient descent

ICML '05 Proceedings of the 22nd international conference on Machine learning
Minimal test collections for retrieval evaluation

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Predicting clicks: estimating the click-through rate for new ads

Proceedings of the 16th international conference on World Wide Web
Reliable information retrieval evaluation with incomplete and biased judgements

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Alternatives to Bpref

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
The relationship between IR effectiveness measures and user satisfaction

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Relevance: A review of the literature and a framework for thinking on the notion in information science. Part III: Behavior and effects of relevance

Journal of the American Society for Information Science and Technology
Inferring document relevance from incomplete information

Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
An experimental comparison of click position-bias models

WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
Novelty and diversity in information retrieval evaluation

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Rank-biased precision for measurement of retrieval effectiveness

ACM Transactions on Information Systems (TOIS)
How does clickthrough data reflect retrieval quality?

Proceedings of the 17th ACM conference on Information and knowledge management
Diversifying search results

Proceedings of the Second ACM International Conference on Web Search and Data Mining
A dynamic bayesian network click model for web search ranking

Proceedings of the 18th international conference on World wide web

Generalized distances between rankings

Proceedings of the 19th international conference on World wide web
Relevance and ranking in online dating systems

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Bayesian Browsing Model: Exact Inference of Document Relevance from Petabyte-Scale Data

ACM Transactions on Knowledge Discovery from Data (TKDD)
Web search solved?: all result rankings the same?

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Exploiting site-level information to improve web search

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Expected browsing utility for web search evaluation

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
A comparative analysis of cascade measures for novelty and diversity

Proceedings of the fourth ACM international conference on Web search and data mining
Ranking from pairs and triplets: information quality, evaluation methods and query complexity

Proceedings of the fourth ACM international conference on Web search and data mining
Detecting duplicate web documents using clickthrough data

Proceedings of the fourth ACM international conference on Web search and data mining
Feature selection under learning to rank model for multimedia retrieve

ICIMCS '10 Proceedings of the Second International Conference on Internet Multimedia Computing and Service
Learning to rank with multiple objective functions

Proceedings of the 20th international conference on World wide web
Parallel boosted regression trees for web search ranking

Proceedings of the 20th international conference on World wide web
Evaluating new search engine configurations with pre-existing judgments and clicks

Proceedings of the 20th international conference on World wide web
On the informativeness of cascade and intent-aware effectiveness measures

Proceedings of the 20th international conference on World wide web
Efficient diversity-aware search

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Relevant knowledge helps in choosing right teacher: active query selection for ranking adaptation

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Intent-aware search result diversification

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
System effectiveness, user models, and user utility: a conceptual framework for investigation

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Evaluating diversified search results using per-intent graded relevance

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Evaluating multi-query sessions

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Pseudo test collections for learning web search ranking functions

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
A robust ranking methodology based on diverse calibration of AdaBoost

ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part I
Model-based inference about IR systems

ICTIR'11 Proceedings of the Third international conference on Advances in information retrieval theory
Aggregated search result diversification

ICTIR'11 Proceedings of the Third international conference on Advances in information retrieval theory
Do subtopic judgments reflect diversity?

ICTIR'11 Proceedings of the Third international conference on Advances in information retrieval theory
Rank and relevance in novelty and diversity metrics for recommender systems

Proceedings of the fifth ACM conference on Recommender systems
Sparse spatial selection for novelty-based search result diversification

SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
Simulating simple user behavior for system effectiveness evaluation

Proceedings of the 20th ACM international conference on Information and knowledge management
Recency ranking by diversification of result set

Proceedings of the 20th ACM international conference on Information and knowledge management
Diverse retrieval via greedy optimization of expected 1-call@k in a latent subtopic relevance model

Proceedings of the 20th ACM international conference on Information and knowledge management
Intent-based diversification of web search results: metrics and algorithms

Information Retrieval
Large-scale validation and analysis of interleaved search evaluation

ACM Transactions on Information Systems (TOIS)
Flexible sample selection strategies for transfer learning in ranking

Information Processing and Management: an International Journal
Evaluation with informational and navigational intents

Proceedings of the 21st international conference on World Wide Web
Good abandonments in factoid queries

Proceedings of the 21st international conference companion on World Wide Web
Tuning parameters of the expected reciprocal rank

Proceedings of the 21st international conference companion on World Wide Web
Construct weak ranking functions for learning linear ranking function

AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
Learning to rank by optimizing expected reciprocal rank

AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
New approaches to diversity and novelty in recommender systems

FDIA'11 Proceedings of the Fourth BCS-IRSG conference on Future Directions in Information Access
Top-k retrieval using facility location analysis

ECIR'12 Proceedings of the 34th European conference on Advances in Information Retrieval
Diversity by proportionality: an election-based approach to search result diversification

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Time-based calibration of effectiveness measures

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Evaluating aggregated search pages

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Personalization of search results using interaction behaviors in search sessions

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
A semi-supervised approach to modeling web search satisfaction

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Social annotations: utility and prediction modeling

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Top-k learning to rank: labeling, ranking and evaluation

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Combining implicit and explicit topic representations for result diversification

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Using preference judgments for novel document retrieval

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Optimizing parameters of the expected reciprocal rank

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Advances on the development of evaluation measures

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Coverage-based search result diversification

Information Retrieval
On the role of novelty for search result diversification

Information Retrieval
Incorporating variability in user behavior into systems based evaluation

Proceedings of the 21st ACM international conference on Information and knowledge management
The effect of aggregated search coherence on search behavior

Proceedings of the 21st ACM international conference on Information and knowledge management
Content-based relevance estimation on the web using inter-document similarities

Proceedings of the 21st ACM international conference on Information and knowledge management
A comprehensive analysis of parameter settings for novelty-biased cumulative gain

Proceedings of the 21st ACM international conference on Information and knowledge management
Stochastic simulation of time-biased gain

Proceedings of the 21st ACM international conference on Information and knowledge management
Active evaluation of ranking functions based on graded relevance

ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Effects of spam removal on search engine efficiency and effectiveness

Proceedings of the Seventeenth Australasian Document Computing Symposium
Reordering an index to speed query processing without loss of effectiveness

Proceedings of the Seventeenth Australasian Document Computing Symposium
Models and metrics: IR evaluation as a user process

Proceedings of the Seventeenth Australasian Document Computing Symposium
Model Based Comparison of Discounted Cumulative Gain and Average Precision

Journal of Discrete Algorithms
Modelling efficient novelty-based search result diversification in metric spaces

Journal of Discrete Algorithms
Differences in search engine evaluations between query owners and non-owners

Proceedings of the sixth ACM international conference on Web search and data mining
Unifying rating-oriented and ranking-oriented collaborative filtering for improved recommendation

Information Sciences: an International Journal
Using intent information to model user behavior in diversified search

ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
Ranked accuracy and unstructured distributed search

ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
Is intent-aware expected reciprocal rank sufficient to evaluate diversity?

ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
Deciding on an adjustment for multiplicity in IR experiments

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
User model-based metrics for offline query suggestion evaluation

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
A novel TF-IDF weighting scheme for effective ranking

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Click model-based information retrieval metrics

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
A mutual information-based framework for the analysis of information retrieval systems

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Search result diversification in resource selection for federated search

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Preference based evaluation measures for novelty and diversity

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Term level search result diversification

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Modeling term dependencies with quantum language models for IR

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Time-aware structured query suggestion

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Diversified recommendation on graphs: pitfalls, measures, and algorithms

Proceedings of the 22nd international conference on World Wide Web
Selecting effective expansion terms for diversity

Proceedings of the 10th Conference on Open Research Areas in Information Retrieval
Evaluating bad query abandonment in an iterative SMS-based FAQ retrieval system

Proceedings of the 10th Conference on Open Research Areas in Information Retrieval
GAPfm: optimal top-n recommendations for graded relevance domains

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Is top-k sufficient for ranking?

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
On the reliability and intuitiveness of aggregated search metrics

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Evaluating aggregated search using interleaving

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Robust models of mouse movement on dynamic web search results pages

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance

Proceedings of the 7th ACM conference on Recommender systems
Active evaluation of ranking functions based on graded relevance

Machine Learning
Users versus models: what observation tells us about effectiveness metrics

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Term associations in query expansion: a structural linguistic perspective

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Diversified query expansion using conceptnet

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Learning to rank query suggestions for adhoc and diversity search

Information Retrieval
Increasing evaluation sensitivity to diversity

Information Retrieval
Improving contextual advertising by adopting collaborative filtering

ACM Transactions on the Web (TWEB)
The water filling model and the cube test: multi-dimensional evaluation for professional search

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Fidelity, Soundness, and Efficiency of Interleaved Comparison Methods

ACM Transactions on Information Systems (TOIS)
Active evaluation of ranking functions based on graded relevance (extended abstract)

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Democracy is good for ranking: towards multi-view rank learning and adaptation in web search

Proceedings of the 7th ACM international conference on Web search and data mining
The whens and hows of learning to rank for web search

Information Retrieval
Contextual and dimensional relevance judgments for reusable SERP-level evaluation

Proceedings of the 23rd international conference on World wide web
Report on the SIGIR 2013 workshop on modeling user behavior for information retrieval evaluation (MUBE 2013)

ACM SIGIR Forum
Improving ranking performance with cost-sensitive ordinal classification via regression

Information Retrieval
Leveraging integrated information to extract query subtopics for search result diversification

Information Retrieval
Tune and mix: learning to rank using ensembles of calibrated multi-class classifiers

Machine Learning
Calibration and regret bounds for order-preserving surrogate losses in learning to rank

Machine Learning
Evaluation in Music Information Retrieval

Journal of Intelligent Information Systems

Abstract

While numerous metrics for information retrieval are available in the case of binary relevance, there is only one commonly used metric for graded relevance, namely the Discounted Cumulative Gain (DCG). A drawback of DCG is its additive nature and the underlying independence assumption: a document in a given position always has the same gain and discount, independently of the documents shown above it. Inspired by the "cascade" user model, we present a new editorial metric for graded relevance that overcomes this difficulty and implicitly discounts documents that are shown below very relevant documents. More precisely, this new metric is defined as the expected reciprocal length of time that the user will take to find a relevant document. This can be seen as an extension of the classical reciprocal rank to the graded relevance case, and we call this metric Expected Reciprocal Rank (ERR). We conduct an extensive evaluation on the query logs of a commercial search engine and show that ERR correlates better with click metrics than other editorial metrics.
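
The metric described in the abstract can be illustrated with a minimal sketch. Under the cascade model, the user scans the ranking from the top and stops at the first document that satisfies them; ERR is then the expected value of 1/rank at the stopping position. The abstract does not give the mapping from relevance grades to stopping probabilities, so the sketch assumes the commonly cited form R(g) = (2^g - 1) / 2^g_max with grades from 0 to 4. The function names and the default grade scale are hypothetical choices made for illustration.

```python
from typing import Sequence


def stop_probability(grade: int, max_grade: int) -> float:
    """Probability that a document with this grade satisfies the user.

    Assumption: the mapping (2^g - 1) / 2^g_max, which is not stated
    in the abstract itself.
    """
    return (2 ** grade - 1) / (2 ** max_grade)


def expected_reciprocal_rank(grades: Sequence[int], max_grade: int = 4) -> float:
    """ERR of a ranked list of relevance grades under the cascade model."""
    err = 0.0
    p_reach = 1.0  # probability that the user reaches the current rank
    for rank, grade in enumerate(grades, start=1):
        r = stop_probability(grade, max_grade)
        err += p_reach * r / rank   # user stops here with probability p_reach * r
        p_reach *= 1.0 - r          # otherwise the user continues downward
    return err


if __name__ == "__main__":
    # The same grade-4 document at rank 2 contributes far less when a
    # grade-4 document is already shown above it.
    print(expected_reciprocal_rank([0, 4]))  # second document alone
    print(expected_reciprocal_rank([4, 4]))  # second document below a perfect one
```

In this sketch the contribution of a document depends on the documents ranked above it through `p_reach`, which is the non-additive behavior that the abstract contrasts with DCG.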