Selecting related terms in query-logs using two-stage SimRank

Authors:
Yunlong Ma;Hongfei Lin;Yuan Lin
Affiliations:
Information Retrieval Laboratory, School of Computer Science and Technology, Dalian University of Technology, Dalian 116023, China;Information Retrieval Laboratory, School of Computer Science and Technology, Dalian University of Technology, Dalian 116023, China;Information Retrieval Laboratory, School of Computer Science and Technology, Dalian University of Technology, Dalian 116023, China
Venue:
Proceedings of the 20th ACM international conference on Information and knowledge management
Year:
2011

Abstract

Query logs from Web search are commonly regarded as a gold mine for the search business because they reflect users' preferences over the Web pages that search engines present, and many studies based on query logs have been carried out in the last few years. In this study, we assume that two queries are relevant to each other when they share a clicked page in their result lists, and we also take into account the topics underlying users' information needs. We therefore propose a Two-Stage SimRank algorithm (called TSS in this paper), based on SimRank and clustering algorithms, to compute the similarity among queries. The resulting similarities are used to discover relevant terms for query expansion, so that topic information and the global relationships among queries are considered together. The query log was collected from a search engine in practical use. Experimental results on two TREC test collections show that our approach discovers qualified terms effectively and improves retrieval performance.
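
The abstract's starting assumption, that two queries are related when they share a clicked page, can be illustrated with plain bipartite SimRank on a query-page click graph. The sketch below is not the TSS algorithm: the abstract does not specify the clustering stage or how the two stages are combined, so only the basic SimRank recurrence is shown. The function name `bipartite_simrank`, the `decay` and `iterations` parameters, and the toy click data are all assumptions made for illustration.

```python
from collections import defaultdict


def bipartite_simrank(clicks, decay=0.8, iterations=5):
    """Return query-query SimRank scores for (query, page) click pairs.

    Illustrative only: plain bipartite SimRank, not the paper's TSS.
    """
    pages_of = defaultdict(set)    # query -> pages clicked for it
    queries_of = defaultdict(set)  # page  -> queries that led to it
    for query, page in clicks:
        pages_of[query].add(page)
        queries_of[page].add(query)

    queries = sorted(pages_of)
    pages = sorted(queries_of)

    # Start from the identity: a node is fully similar only to itself.
    q_sim = {(a, b): float(a == b) for a in queries for b in queries}
    p_sim = {(a, b): float(a == b) for a in pages for b in pages}

    def update(nodes, neighbours, other_sim):
        # Similarity of two nodes is the decayed average similarity
        # of their neighbours on the other side of the graph.
        new_sim = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new_sim[(a, b)] = 1.0
                    continue
                na, nb = neighbours[a], neighbours[b]
                total = sum(other_sim[(x, y)] for x in na for y in nb)
                new_sim[(a, b)] = decay * total / (len(na) * len(nb))
        return new_sim

    for _ in range(iterations):
        # Both sides are updated from the previous iteration's scores.
        new_q = update(queries, pages_of, p_sim)
        new_p = update(pages, queries_of, q_sim)
        q_sim, p_sim = new_q, new_p
    return q_sim


# Hypothetical click log: two queries share page "p1", one is unrelated.
toy_clicks = [
    ("jaguar car", "p1"),
    ("jaguar price", "p1"),
    ("jaguar price", "p2"),
    ("python snake", "p3"),
]
scores = bipartite_simrank(toy_clicks)
```

In this toy log, the two "jaguar" queries receive a positive score because they share a clicked page, while "python snake" stays at zero because it is disconnected from them in the click graph. A query expansion step could then draw candidate terms from the highest-scoring neighbouring queries, although how the paper selects and weights those terms is not stated in the abstract.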