Selecting related terms in query-logs using two-stage SimRank

  • Authors:
  • Yunlong Ma; Hongfei Lin; Yuan Lin

  • Affiliations:
  • Information Retrieval Laboratory, School of Computer Science and Technology, Dalian University of Technology, Dalian 116023, China (all three authors)

  • Venue:
  • Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year:
  • 2011

Abstract

Query logs from Web search are widely regarded as a gold mine for the search business because they reflect users' preferences over the Web pages that search engines present, and many studies based on query logs have been carried out in recent years. In this study, we assume that two queries are relevant to each other when they share a clicked page in their result lists, and we also take into account the topics of the user needs behind the queries. We therefore propose a Two-Stage SimRank algorithm (called TSS in this paper), which builds on SimRank and clustering algorithms to compute the similarity among queries. These similarities are then used to discover relevant terms for query expansion, taking both topic information and the global relationships among queries into account, with a query log collected from a search engine in practical use. Experimental results on two TREC test collections show that our approach discovers good-quality expansion terms effectively and improves retrieval performance.
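The abstract's core assumption, that queries sharing a clicked page are related, can be illustrated with plain SimRank on a query-click bipartite graph. The following is only a minimal sketch of that baseline, not the authors' TSS algorithm: it omits the clustering stage entirely, and the function name `bipartite_simrank`, the decay factor 0.8, the iteration count, and the toy click data are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def bipartite_simrank(clicks, decay=0.8, iterations=5):
    """Plain SimRank on a query/page click graph (illustrative baseline only).

    clicks: iterable of (query, page) pairs taken from a query log.
    Returns a dict mapping (query_a, query_b) to a similarity score.
    """
    q_to_p = defaultdict(set)  # query -> pages clicked for it
    p_to_q = defaultdict(set)  # page  -> queries that led to a click on it
    for query, page in clicks:
        q_to_p[query].add(page)
        p_to_q[page].add(query)

    def identity(nodes):
        # Initial scores: 1 for a node with itself, 0 otherwise.
        return {(a, b): 1.0 if a == b else 0.0
                for a, b in product(nodes, repeat=2)}

    def update(neighbours, other_sim):
        # One SimRank step: average the similarity of all neighbour pairs.
        new_sim = {}
        for a, b in product(list(neighbours), repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            total = sum(other_sim[(x, y)]
                        for x in neighbours[a] for y in neighbours[b])
            new_sim[(a, b)] = decay * total / (
                len(neighbours[a]) * len(neighbours[b]))
        return new_sim

    q_sim = identity(list(q_to_p))
    p_sim = identity(list(p_to_q))
    for _ in range(iterations):
        # Both sides are updated from the previous iteration's scores.
        q_sim, p_sim = update(q_to_p, p_sim), update(p_to_q, q_sim)
    return q_sim


# Hypothetical click log: two queries share a clicked page, one does not.
clicks = [
    ("jaguar car", "page_a"),
    ("jaguar price", "page_a"),
    ("jaguar cat", "page_b"),
]
sim = bipartite_simrank(clicks)
```

In this toy log, "jaguar car" and "jaguar price" receive a positive score because they share a clicked page, while "jaguar cat" stays at zero similarity to both. A two-stage variant would presumably restrict or reweight such scores using topic clusters, but the abstract does not give those details, so they are not sketched here.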