From search session detection to search mission detection

  • Authors: Matthias Hagen, Jakob Gomoll, Anna Beyer, Benno Stein
  • Affiliation: Bauhaus-Universität Weimar, Weimar, Germany (all authors)
  • Venue: Proceedings of the 10th Conference on Open Research Areas in Information Retrieval
  • Year: 2013

Abstract

Search mission detection aims at identifying those queries a user submits for the same information need. Such knowledge offers interesting insights into behavioral usage patterns and can often help to better support a user. However, most existing query log studies focus on search sessions only (consecutive queries for the same need) and ignore multitasking behavior (interleaved information needs) as well as hierarchies of short-term search goals in multiple sessions that form a long-term search task such as vacation planning. To better understand the dialog between user and search engine, we distinguish between (1) physical search sessions, characterized by the time gap between queries, (2) logical search sessions, characterized by consecutive queries for the same information need within a physical session, and (3) search missions, characterized by logical sessions, multitasking behavior, and hierarchical goals. Our contributions are threefold. First, we present a new algorithm for logical session detection, which follows the state-of-the-art cascading method's rationale of combining effectiveness with efficiency. Our approach is applicable within the time-critical online scenario, where a search engine tries to support users by incorporating knowledge about their search history on the fly, as well as within the offline scenario, where the objective is to accurately partition a collected log. We improve several steps of the cascading method, among others by exploiting Linked Open Data information. Second, we demonstrate our new algorithm's applicability to accurately detect search missions. Third, we introduce a new publicly available corpus of 8800 queries labeled with session and mission information.
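
The simplest of the three notions above, the physical search session, can be illustrated with a minimal sketch that splits one user's log wherever the time gap between consecutive queries grows too large. This is not the authors' algorithm: the function name `split_physical_sessions`, the `(timestamp, query)` log format, and the 30-minute threshold are assumptions made for illustration only, since the abstract does not state which gap is used. Logical session and mission detection would need additional evidence about the information need and are not covered here.

```python
from datetime import datetime, timedelta

# Hypothetical threshold; the abstract does not state the gap the authors use.
MAX_GAP = timedelta(minutes=30)


def split_physical_sessions(log, max_gap=MAX_GAP):
    """Split one user's (timestamp, query) pairs into physical sessions.

    A new session starts whenever the gap to the previous query exceeds max_gap.
    """
    sessions = []
    for timestamp, query in sorted(log, key=lambda entry: entry[0]):
        # Compare against the timestamp of the last query in the current session.
        if sessions and timestamp - sessions[-1][-1][0] <= max_gap:
            sessions[-1].append((timestamp, query))
        else:
            sessions.append([(timestamp, query)])
    return sessions


# Made-up example log for a single user.
log = [
    (datetime(2013, 5, 1, 9, 0), "flights to rome"),
    (datetime(2013, 5, 1, 9, 5), "rome hotels"),
    (datetime(2013, 5, 1, 14, 0), "python sort list"),
]
sessions = split_physical_sessions(log)
```

In this made-up log, the first two queries fall within the assumed gap and form one physical session, while the third query, issued hours later, opens a new one. A time gap alone cannot tell whether neighboring queries serve the same information need, which is why the abstract treats logical sessions and missions as separate levels.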