The anatomy of a click: modeling user behavior on web information systems

  • Authors: Kunal Punera; Srujana Merugu

  • Affiliations: Yahoo! Research, Sunnyvale, CA, USA (both authors)

  • Venue: CIKM '10: Proceedings of the 19th ACM International Conference on Information and Knowledge Management
  • Year: 2010

Abstract

The ultimate goal of information retrieval science continues to be providing relevant information to users while placing minimal cognitive load on them. The retrieval and presentation of relevant information (say, search results), as well as any dynamic system behavior (e.g., search engine re-ranking), depend acutely on estimating user intent. Hence, it is critical to use all the available information about user behavior at every stage of a search session to infer user intent accurately. However, the simple interfaces that search engines provide to minimize users' cognitive effort, together with intrinsic limits imposed by privacy concerns, latency requirements, and other web instrumentation challenges, mean that only a subset of the user actions predictive of search intent is captured. In this paper, we present a dynamic Bayesian network (DBN) that models user interaction with general web information systems, taking into account both observed user actions (clicks, etc.) and hidden ones (result examinations, etc.). Our model goes beyond the ranked-list information access paradigm and gives a solution in which arbitrary context information can be incorporated in a principled fashion. To account for heterogeneity in user behavior as well as in information access tasks, we further propose a bi-clustering algorithm that partitions users and tasks and learns a separate model for each bicluster. We instantiate this general DBN model for a typical static search interface comprising a single query box and a ranked list of search results, using a set of seven common user actions and various predictive state attributes. Experimental results on real-world web search log data indicate that one can obtain superior predictive performance on various session properties (such as click positions and reformulations) compared to simpler instantiations of the DBN.
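
To make the distinction between observed and hidden user actions concrete, the following is a minimal sketch of forward filtering in a toy two-variable chain: a hidden "examined" state and an observed "clicked" action at each rank. It is not the DBN proposed in the paper, which uses seven user actions and additional state attributes. The function name `examination_posteriors`, the cascade-style assumptions, and every probability value are hypothetical choices made only for illustration.

```python
# Toy illustration only: NOT the model from the paper.
# Hidden variable per rank i:   E_i = 1 if the user examined result i.
# Observed variable per rank i: C_i = 1 if the user clicked result i.
# All numbers below are made-up assumptions.

P_EXAMINE_FIRST = 1.0          # assumed: the top result is always examined
P_CLICK_GIVEN_EXAMINED = 0.3   # assumed P(C_i = 1 | E_i = 1)
P_CLICK_GIVEN_SKIPPED = 0.0    # assumed: no click without examination
# Assumed P(E_{i+1} = 1 | E_i = 1, C_i); an unexamined rank ends the scan.
P_CONTINUE = {0: 0.8, 1: 0.5}


def examination_posteriors(clicks):
    """Return P(E_i = 1 | C_1..C_i) for each rank (filtering, not smoothing)."""
    posteriors = []
    prior = P_EXAMINE_FIRST  # P(E_i = 1 | C_1..C_{i-1})
    for c in clicks:
        # Likelihood of the observed action under each hidden state.
        like_examined = P_CLICK_GIVEN_EXAMINED if c else 1.0 - P_CLICK_GIVEN_EXAMINED
        like_skipped = P_CLICK_GIVEN_SKIPPED if c else 1.0 - P_CLICK_GIVEN_SKIPPED
        evidence = prior * like_examined + (1.0 - prior) * like_skipped
        if evidence == 0.0:
            raise ValueError("observed clicks are impossible under this toy model")
        # Bayes update for the hidden examination state at this rank.
        posterior = prior * like_examined / evidence
        posteriors.append(posterior)
        # Predict the hidden state at the next rank.
        prior = posterior * P_CONTINUE[c]
    return posteriors


if __name__ == "__main__":
    # Session with no click at ranks 1 and 2 and a click at rank 3.
    print(examination_posteriors([0, 0, 1]))
```

Because this is filtering, the belief that rank 2 was examined uses only the actions up to rank 2; a smoothing pass that also used the later click at rank 3 would revise it upward.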