Efficient query recommendations in the long tail via center-piece subgraphs

Authors:
Francesco Bonchi;Raffaele Perego;Fabrizio Silvestri;Hossein Vahabi;Rossano Venturini
Affiliations:
Yahoo! Research, Barcelona, Spain;ISTI-CNR, Pisa, Italy;ISTI-CNR, Pisa, Italy;IMT, Lucca, Italy;Dept. of Computer Science, University of Pisa, Pisa, Italy
Venue:
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Year:
2012

Citing 15
Cited 5

Concept-based interactive query expansion

Proceedings of the 14th ACM international conference on Information and knowledge management
Generating query substitutions

Proceedings of the 15th international conference on World Wide Web
Center-piece subgraphs: problem definition and fast solutions

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Heads and tails: studies of web search with common and rare queries

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Extracting semantic relations from query logs

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Design trade-offs for search engine caching

ACM Transactions on the Web (TWEB)
Query suggestion using hitting time

Proceedings of the 17th ACM conference on Information and knowledge management
The query-flow graph: model and applications

Proceedings of the 17th ACM conference on Information and knowledge management
Online expansion of rare queries for sponsored search

Proceedings of the 18th international conference on World wide web
Optimal rare query suggestion with implicit user feedback

Proceedings of the 19th international conference on World wide web
VSEncoding: efficient coding and fast decoding of integer lists via dynamic programming

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Learning similarity function for rare queries

Proceedings of the fourth ACM international conference on Web search and data mining
Improving recommendation for long-tail queries via templates

Proceedings of the 20th international conference on World wide web
Query reformulation mining: models, patterns, and applications

Information Retrieval
Synthesizing high utility suggestions for rare web search queries

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval

From machu_picchu to "rafting the urubamba river": anticipating information needs via the entity-query graph

Proceedings of the sixth ACM international conference on Web search and data mining
Task-aware query recommendation

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Graph-of-word and TW-IDF: new approach to ad hoc IR

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Penguins in sweaters, or serendipitous entity search on user-generated content

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Orthogonal query recommendation

Proceedings of the 7th ACM conference on Recommender systems

Abstract

We present a recommendation method based on the well-known concept of the center-piece subgraph that allows for the time- and space-efficient generation of suggestions even for rare, i.e., long-tail, queries. Our method is scalable with respect to both the size of the datasets from which the model is computed and the heavy workloads that current web search engines have to handle. In essence, we relate the terms contained in queries to highly correlated queries in a query-flow graph. This enables a novel recommendation generation method able to produce recommendations for approximately 99% of the workload of a real-world search engine. The method is based on a graph having term nodes, query nodes, and two kinds of connections: term-query and query-query. The first connects a term to the queries in which it is contained; the second connects two query nodes if the likelihood that a user submits the second query after having issued the first one is sufficiently high. On such a large graph we need to compute the center-piece subgraph induced by the terms contained in queries. To reduce the cost of this computation, we introduce a novel and efficient method based on an inverted index representation of the model. We evaluate our solution on two real-world query logs and show that its effectiveness is comparable to (and in some cases better than) that of state-of-the-art methods for head queries. More importantly, the quality of the generated recommendations remains very high also for long-tail queries, where other methods fail to produce any suggestion at all. Finally, we extensively investigate scalability and efficiency issues and show the viability of our method in real-world search engines.
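
The abstract describes the model only at a high level, so the following Python sketch is an illustration under stated assumptions rather than the authors' implementation. It assumes that term-query edges are directed from term to query with uniform weight, that query-query edges are kept when a transition probability exceeds a threshold (`min_prob`, a hypothetical parameter), that relevance is measured by a random walk with restart from each term node, and that the center-piece scores for a multi-term query are obtained by multiplying the per-term scores. All function names and parameters (`build_tq_graph`, `rwr`, `build_index`, `recommend`, `restart`, `top_k`) are made up for this example.

```python
import math
from collections import defaultdict


def build_tq_graph(queries, transitions, min_prob=0.1):
    """Build a toy term-query graph as node -> {neighbor: weight}.

    Nodes are ("t", term) or ("q", query). Assumption: term->query edges
    are uniform, query->query edges are kept only if p >= min_prob.
    """
    graph = defaultdict(dict)
    for q in queries:
        for term in set(q.split()):
            graph[("t", term)][("q", q)] = 1.0
    for (q1, q2), p in transitions.items():
        if p >= min_prob:
            graph[("q", q1)][("q", q2)] = p
    # Row-normalize so outgoing weights of each node sum to 1.
    for nbrs in graph.values():
        total = sum(nbrs.values())
        for n in nbrs:
            nbrs[n] /= total
    return graph


def rwr(graph, start, restart=0.15, iters=50):
    """Random walk with restart from `start`, by power iteration."""
    scores = {start: 1.0}
    for _ in range(iters):
        nxt = defaultdict(float)
        nxt[start] += restart
        for node, mass in scores.items():
            nbrs = graph.get(node)
            if not nbrs:
                # Dangling node: send its mass back to the start node.
                nxt[start] += (1 - restart) * mass
                continue
            for n, w in nbrs.items():
                nxt[n] += (1 - restart) * mass * w
        scores = nxt
    return scores


def build_index(graph, top_k=100):
    """Inverted index: term -> {query: score}, precomputed offline."""
    index = {}
    for kind, term in list(graph):
        if kind != "t":
            continue
        scores = rwr(graph, ("t", term))
        postings = {n[1]: s for n, s in scores.items() if n[0] == "q"}
        top = sorted(postings.items(), key=lambda kv: -kv[1])[:top_k]
        index[term] = dict(top)
    return index


def recommend(index, query, k=5):
    """Score queries by the product of per-term scores (AND semantics).

    Terms missing from the index are skipped, so an unseen query can
    still get suggestions as long as some of its terms are known.
    """
    lists = [index[t] for t in set(query.split()) if t in index]
    if not lists:
        return []
    common = set.intersection(*(set(p) for p in lists))
    common.discard(query)
    scored = {q: math.prod(p[q] for p in lists) for q in common}
    return sorted(scored, key=lambda q: (-scored[q], q))[:k]


# Toy usage: the query below never appears in the log.
queries = ["cheap flights rome", "rome hotels", "flights rome", "rome airport"]
transitions = {
    ("cheap flights rome", "rome airport"): 0.6,
    ("flights rome", "rome airport"): 0.5,
    ("rome hotels", "rome airport"): 0.05,  # below threshold, dropped
}
index = build_index(build_tq_graph(queries, transitions))
suggestions = recommend(index, "flights rome tickets")
```

In this sketch the expensive random walks run once per term at indexing time, and answering a query reduces to intersecting a few posting lists. That is one plausible reading of why an inverted index helps with rare queries: a query never seen in the log can still be served if its individual terms were seen.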