Large scale query log analysis of re-finding

  • Authors: Sarah K. Tyler; Jaime Teevan
  • Affiliations: University of California, Santa Cruz, Santa Cruz, CA, USA; Microsoft Research, Redmond, WA, USA
  • Venue: Proceedings of the third ACM international conference on Web search and data mining
  • Year: 2010

Abstract

Although Web search engines are targeted towards helping people find new information, people regularly use them to re-find Web pages they have seen before. Researchers have noted the existence of this phenomenon, but relatively little is understood about how re-finding behavior differs from the finding of new information. This paper dives deeply into the differences via analysis of three large-scale data sources: 1) query logs (queries, clicks, result impressions), 2) Web browsing logs (URL visits), and 3) a daily Web crawl (page content). It appears that people learn valuable information about the pages they find that helps them re-find what they are looking for later; compared to the initial finding query, re-finding queries are typically shorter, and rank the re-found URL higher. While many instances of re-finding probably serve as a type of bookmark for a known URL, others seem to represent the resumption of a previous task; results clicked at the end of a session are more likely than those at the beginning to be re-found during a later session, while re-finding is more likely to happen at the beginning of a session than at the end. Additionally, we observe differences in cross-session and intra-session re-finding that may indicate different types of re-finding tasks. Our findings suggest there is a rich opportunity for search engines to take advantage of re-finding behavior as a means to improve the search experience.