A user browsing model to predict search engine click data from past observations.

  • Authors:
  • Georges E. Dupret;Benjamin Piwowarski

  • Affiliations:
  • Yahoo! Research Latin America, Santiago, Chile;Yahoo! Research Latin America, Santiago, Chile

  • Venue:
  • Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Search engine click logs provide an invaluable source of relevance information but this information is biased because we ignore which documents from the result list the users have actually seen before and after they clicked. Otherwise, we could estimate document relevance by simple counting. In this paper, we propose a set of assumptions on user browsing behavior that allows the estimation of the probability that a document is seen, thereby providing an unbiased estimate of document relevance. To train, test and compare our model to the best alternatives described in the Literature, we gather a large set of real data and proceed to an extensive cross-validation experiment. Our solution outperforms very significantly all previous models. As a side effect, we gain insight into the browsing behavior of users and we can compare it to the conclusions of an eye-tracking experiments by Joachims et al. [12]. In particular, our findings confirm that a user almost always see the document directly after a clicked document. They also explain why documents situated just after a very relevant document are clicked more often.