Query suggestion and auto-completion mechanisms are widely used by search engines and are attracting increasing interest from the research community. However, the lack of a commonly accepted evaluation methodology and metrics makes it impossible to compare results and approaches across the literature. Moreover, the metrics used to evaluate query suggestions are often adapted from other domains without proper justification, so it is not clear whether the improvements reported in the literature would translate into an actual improvement in the users' experience. Inspired by cascade user models and state-of-the-art evaluation metrics from the web search domain, we address query suggestion evaluation by first studying user behaviour in a search engine's query log and deriving from it a new family of user models describing how users interact with a query suggestion mechanism. Next, assuming a query log-based evaluation approach, we propose two new metrics for evaluating query suggestions, pSaved and eSaved, both parameterised by a user model. pSaved is defined as the probability that the user submits a query by means of the query suggestions. eSaved equates to the expected relative amount of effort (keypresses) a user can avoid thanks to the deployed query suggestion mechanism. Finally, we experiment with both metrics under four user model instantiations, as well as with metrics previously used in the literature, on a dataset of 6.1M sessions. Our results demonstrate that pSaved and eSaved show the best alignment with user satisfaction amongst the considered metrics.
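To make the two definitions concrete, the following minimal sketch computes pSaved and eSaved for a single target query. It assumes a hypothetical positional user model (the `acceptance_prob` function, its `look_prob` and `persistence` parameters, and the `suggest` callback are illustrative assumptions, not the models derived in the paper): after each keystroke the user glances at the suggestion list with some probability, scans it top-down with geometric persistence, and stops typing as soon as they accept the target suggestion.

```python
# Illustrative sketch of pSaved / eSaved as described in the abstract.
# The user model below is a hypothetical cascade-style stand-in; the
# paper derives its actual user models from query-log analysis.

def acceptance_prob(rank: int, look_prob: float = 0.4,
                    persistence: float = 0.7) -> float:
    """Probability of accepting the matching suggestion at `rank`
    (0 = top): the user looks at the list with probability look_prob,
    then continues past each position with probability persistence."""
    return look_prob * (persistence ** rank)

def psaved_esaved(query: str, suggest):
    """Return (pSaved, eSaved) for `query`, given a function
    suggest(prefix) -> ranked list of suggested queries."""
    n = len(query)
    p_still_typing = 1.0  # probability the user has not yet accepted
    p_saved = 0.0
    e_saved = 0.0
    for i in range(1, n + 1):          # after typing i characters...
        suggestions = suggest(query[:i])
        if query in suggestions:
            rank = suggestions.index(query)
            p_accept = p_still_typing * acceptance_prob(rank)
            p_saved += p_accept                   # suggestion was used
            e_saved += p_accept * (n - i) / n     # fraction of keypresses avoided
            p_still_typing -= p_accept
    return p_saved, e_saved

# Toy usage with a hypothetical static candidate pool.
def toy_suggest(prefix: str):
    candidates = ["information retrieval", "information theory", "informatics"]
    return [c for c in candidates if c.startswith(prefix)]

print(psaved_esaved("information retrieval", toy_suggest))
```

Under this sketch, pSaved accumulates the probability mass of the user accepting the target suggestion at any prefix length, while eSaved weights each such event by the relative number of keypresses it spares, so both reduce to expectations over the assumed user model, as in the abstract.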