Cross Domain Random Walk for Query Intent Pattern Mining from Search Engine Log

  • Authors:
  • Siyu Gu;Jun Yan;Lei Ji;Shuicheng Yan;Junshi Huang;Ning Liu;Ying Chen;Zheng Chen

  • Venue:
  • ICDM '11 Proceedings of the 2011 IEEE 11th International Conference on Data Mining
  • Year:
  • 2011

Abstract

Understanding users' search intents from their short, condensed queries has attracted much attention in both academia and industry. Search intents are generally assumed to be associated with query patterns such as "MobileName price", where "MobileName" can be any named entity denoting a mobile phone model; this pattern indicates that the user intends to buy a mobile phone. However, discovering query intent patterns for general search is challenging, mainly because it is difficult to collect sufficient training data for learning query patterns across a large number of searchable domains. In this work, we propose the Cross Domain Random Walk (CDRW) algorithm, a semi-supervised method that discovers query intent patterns across different domains from search engine click-through log data. Starting with manually tagged seed queries in one or more independent domains, CDRW uses query patterns as a bridge and propagates transition probabilities across domains to collect query intent patterns in different domains, based on the assumption that "users who have similar intent in different but similar domains will have high probability to share similar query patterns across domains". Unlike classical random walk algorithms, CDRW walks across domains to disseminate shared knowledge in a transfer learning manner. Extensive experiments on real log data from a commercial search engine validate the effectiveness and efficiency of the proposed algorithm.
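
The abstract does not give the CDRW update rules, so the following is only a minimal illustrative sketch of the general idea of using shared query patterns as a bridge between domains. It is not the authors' algorithm. It assumes a bipartite query-pattern graph, a generic entity placeholder `<E>` so that patterns are comparable across domains, and a simple clamped averaging scheme in place of the paper's transition probabilities. The function name `propagate_intent`, the edge weights, and the example queries are all hypothetical.

```python
from collections import defaultdict


def propagate_intent(edges, seeds, steps=20):
    """Clamped score propagation on a bipartite query-pattern graph.

    edges: iterable of (query, pattern, weight) tuples; the weight is a
           hypothetical positive number, e.g. a click count from the log.
    seeds: dict mapping manually tagged queries to an intent score in [0, 1].
    Returns (query_scores, pattern_scores).
    """
    q2p = defaultdict(dict)  # query -> {pattern: weight}
    p2q = defaultdict(dict)  # pattern -> {query: weight}
    for query, pattern, weight in edges:
        q2p[query][pattern] = q2p[query].get(pattern, 0.0) + weight
        p2q[pattern][query] = p2q[pattern].get(query, 0.0) + weight

    # Untagged queries start at 0.0; seed queries start at their tagged score.
    q_score = {q: seeds.get(q, 0.0) for q in q2p}
    p_score = {p: 0.0 for p in p2q}

    for _ in range(steps):
        # A pattern's score is the weighted mean of the queries that match it.
        for p, queries in p2q.items():
            total = sum(queries.values())
            p_score[p] = sum(w * q_score[q] for q, w in queries.items()) / total
        # An untagged query's score is the weighted mean of its patterns.
        # Seed queries stay clamped to their manually tagged score.
        for q, patterns in q2p.items():
            if q in seeds:
                continue
            total = sum(patterns.values())
            q_score[q] = sum(w * p_score[p] for p, w in patterns.items()) / total
    return q_score, p_score


# Hypothetical toy data: seeds come from the mobile phone domain only.
edges = [
    ("iphone 4 price", "<E> price", 1.0),
    ("iphone 4 review", "<E> review", 1.0),
    ("canon eos price", "<E> price", 1.0),
    ("canon eos manual", "<E> manual", 1.0),
]
seeds = {"iphone 4 price": 1.0, "iphone 4 review": 0.0}
query_scores, pattern_scores = propagate_intent(edges, seeds)
```

In this toy example the untagged camera query "canon eos price" shares the pattern "<E> price" with a tagged mobile query, so its purchase-intent score moves toward 1.0, while "<E> manual", which has no link to any seed, stays at 0.0.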