Adapting document ranking to users’ preferences using click-through data

Authors:
Min Zhao; Hang Li; Adwait Ratnaparkhi; Hsiao-Wuen Hon; Jue Wang
Affiliations:
Institute of Automation, Chinese Academy of Sciences, Beijing, China; Microsoft Research Asia, Beijing, China; Microsoft Corporation, Redmond, WA; Microsoft Research Asia, Beijing, China; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Venue:
AIRS'06 Proceedings of the Third Asia conference on Information Retrieval Technology
Year:
2006

Abstract

This paper proposes a new approach to ranking the documents retrieved by a search engine using click-through data. The goal is for the final ranked list to accurately reflect the user preferences expressed in the click-through data. Our approach combines the ranking produced by a traditional IR algorithm (BM25) with the ranking produced by a machine learning algorithm (Naïve Bayes). The machine learning algorithm is trained on click-through data (queries and their associated documents), while the IR algorithm runs over the document collection. We consider several alternative strategies for combining the click-through-based results with the document-based results. Experimental results confirm that every method that uses click-through data greatly improves preference ranking over BM25 alone. A linear combination of Naïve Bayes scores and BM25 scores performs best for this task. We also found that the preference ranking methods can preserve relevance ranking, i.e., they can perform as well as BM25 on relevance ranking.
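
To make the linear combination concrete, the following is a minimal sketch of fusing two score lists for a single query. It is not the paper's implementation: the abstract does not state how scores are normalized or how the weight is chosen, so the min-max normalization, the `alpha` parameter, and the function names `min_max` and `combine` are all assumptions made for illustration.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] (assumed normalization, not from the paper)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: no document is preferred over another.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine(
    bm25: Dict[str, float],
    nb: Dict[str, float],
    alpha: float = 0.5,  # hypothetical weight on the Naive Bayes score
) -> List[str]:
    """Rank documents by alpha * NB + (1 - alpha) * BM25 on normalized scores."""
    bm25_n, nb_n = min_max(bm25), min_max(nb)
    docs = set(bm25_n) | set(nb_n)
    # A document missing from one list contributes 0.0 from that list.
    fused = {
        d: alpha * nb_n.get(d, 0.0) + (1 - alpha) * bm25_n.get(d, 0.0)
        for d in docs
    }
    # Sort by fused score, highest first; break ties by document id.
    return sorted(docs, key=lambda d: (-fused[d], d))


# Made-up scores for one query: BM25 values and Naive Bayes log-probabilities.
bm25_scores = {"d1": 12.0, "d2": 9.0, "d3": 3.0}
nb_scores = {"d1": -8.0, "d2": -2.0, "d3": -5.0}
ranking = combine(bm25_scores, nb_scores, alpha=0.5)
```

With `alpha=0.0` the ranking reduces to BM25 alone, and with `alpha=1.0` it follows only the click-through model; intermediate values trade the two off. In practice the weight would be tuned on held-out click-through data.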