User model-based metrics for offline query suggestion evaluation

Authors:
Eugene Kharitonov;Craig Macdonald;Pavel Serdyukov;Iadh Ounis
Affiliations:
Yandex & Glasgow University, Moscow, Russian Fed.;Glasgow University, Glasgow, United Kingdom;Yandex, Moscow, Russian Fed.;Glasgow University, Glasgow, United Kingdom
Venue:
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Year:
2013

Citing 22
Cited 0

Cumulated gain-based evaluation of IR techniques

ACM Transactions on Information Systems (TOIS)
The Philosophy of Information Retrieval Evaluation

CLEF '01 Revised Papers from the Second Workshop of the Cross-Language Evaluation Forum on Evaluation of Cross-Language Information Retrieval Systems
An experimental comparison of click position-bias models

WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
A user browsing model to predict search engine click data from past observations.

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Novelty and diversity in information retrieval evaluation

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
How does clickthrough data reflect retrieval quality?

Proceedings of the 17th ACM conference on Information and knowledge management
A dynamic bayesian network click model for web search ranking

Proceedings of the 18th international conference on World wide web
Expected reciprocal rank for graded relevance

Proceedings of the 18th ACM conference on Information and knowledge management
Do user preferences and evaluation measures line up?

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Comparing the sensitivity of information retrieval metrics

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Expected browsing utility for web search evaluation

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Context-sensitive query auto-completion

Proceedings of the 20th international conference on World wide web
Online spelling correction for query completion

Proceedings of the 20th international conference on World wide web
Inferring and using location metadata to personalize web search

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
User-click modeling for understanding and predicting search-behavior

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Personalizing web search results by reading level

Proceedings of the 20th ACM international conference on Information and knowledge management
Intent-based diversification of web search results: metrics and algorithms

Information Retrieval
Query recommendation using query logs in search engines

EDBT'04 Proceedings of the 2004 international conference on Current Trends in Database Technology
Actualization of query suggestions using query logs

Proceedings of the 21st international conference companion on World Wide Web
Time-sensitive query auto-completion

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Incorporating variability in user behavior into systems based evaluation

Proceedings of the 21st ACM international conference on Information and knowledge management
More than relevance: high utility query recommendation by mining users' search behaviors

Proceedings of the 21st ACM international conference on Information and knowledge management

Quantified Score

Hi-index	0.00

Visualization

Abstract

Query suggestion or auto-completion mechanisms are widely used by search engines and are increasingly attracting interest from the research community. However, the lack of commonly accepted evaluation methodology and metrics means that it is not possible to compare results and approaches from the literature. Moreover, often the metrics used to evaluate query suggestions tend to be an adaptation from other domains without a proper justification. Hence, it is not necessarily clear if the improvements reported in the literature would result in an actual improvement in the users' experience. Inspired by the cascade user models and state-of-the-art evaluation metrics in the web search domain, we address the query suggestion evaluation, by first studying the users behaviour from a search engine's query log and thereby deriving a new family of user models describing the users interaction with a query suggestion mechanism. Next, assuming a query log-based evaluation approach, we propose two new metrics to evaluate query suggestions, pSaved and eSaved. Both metrics are parameterised by a user model. pSaved is defined as the probability of using the query suggestions while submitting a query. eSaved equates to the expected relative amount of effort (keypresses) a user can avoid due to the deployed query suggestion mechanism. Finally, we experiment with both metrics using four user model instantiations as well as metrics previously used in the literature on a dataset of 6.1M sessions. Our results demonstrate that pSaved and eSaved show the best alignment with the users satisfaction amongst the considered metrics.