Query reformulation mining: models, patterns, and applications

Authors:
Paolo Boldi;Francesco Bonchi;Carlos Castillo;Sebastiano Vigna
Affiliations:
DSI, Università degli studi di Milano, Milan, Italy 20135;Yahoo! Research, Barcelona, Spain 080018;Yahoo! Research, Barcelona, Spain 080018;DSI, Università degli studi di Milano, Milan, Italy 20135
Venue:
Information Retrieval
Year:
2011

Abstract

Understanding query reformulation patterns is a key step towards next-generation web search engines: it makes it possible to build systems that understand and possibly predict user intent, provide the needed assistance at the right time, and thus help users locate information more effectively and improve their web-search experience. As a step in this direction, we build an accurate model for classifying user query reformulations into four broad classes (generalization, specialization, error correction, or parallel move), achieving 92% accuracy. We then apply the model to automatically label two very large query logs sampled from different geographic areas, containing a total of approximately 17 million query reformulations. We study the resulting reformulation patterns, confirming some results from previous studies performed on smaller manually annotated datasets and discovering new reformulation patterns, including connections between reformulation types and topical categories. We annotate two large query-flow graphs with reformulation type information and run several graph-characterization experiments on them, extracting new insights about the relationships between the different query reformulation types. Finally, we study query recommendations based on short random walks on the query-flow graphs. Our experiments show that these methods match, and often improve on, the precision of recommendations based on query-click graphs, without requiring users' clicks. They also show that taking the transition-type labels on edges into account is important for obtaining good-quality recommendations.
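
To make the last step of the abstract concrete, here is a minimal sketch of recommendation by a short random walk on a query-flow graph whose edges carry reformulation-type labels. It is an illustration under stated assumptions, not the authors' implementation: the toy graph, the edge weights, the single-letter type codes (S, G, C, P), and the function names `transition_probs` and `short_walk_scores` are all hypothetical, and the walk shown is a plain fixed-length walk, which may differ from the variant used in the paper.

```python
from collections import defaultdict

# Hypothetical toy query-flow graph: (source, target) -> (weight, type).
# Type codes are an assumption of this sketch:
# "S" specialization, "G" generalization, "C" error correction, "P" parallel move.
EDGES = {
    ("dango", "dango recipe"): (6.0, "S"),
    ("dango", "japanese cakes"): (3.0, "G"),
    ("dango", "dangoo"): (1.0, "C"),
    ("japanese cakes", "mochi"): (2.0, "S"),
    ("dango recipe", "mochi recipe"): (1.0, "P"),
}


def transition_probs(edges, allowed_types=None):
    """Row-normalize edge weights, optionally keeping only some edge types."""
    out = defaultdict(dict)
    for (src, dst), (weight, kind) in edges.items():
        if allowed_types is None or kind in allowed_types:
            out[src][dst] = weight
    probs = {}
    for src, targets in out.items():
        total = sum(targets.values())
        probs[src] = {dst: w / total for dst, w in targets.items()}
    return probs


def short_walk_scores(edges, start, steps=3, allowed_types=None):
    """Score queries by the probability of reaching them in exactly `steps` steps."""
    probs = transition_probs(edges, allowed_types)
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, mass in dist.items():
            targets = probs.get(node)
            if not targets:
                # Node without outgoing edges: the walker stays where it is.
                nxt[node] += mass
                continue
            for dst, p in targets.items():
                nxt[dst] += mass * p
        dist = dict(nxt)
    # The starting query is not a useful recommendation for itself.
    dist.pop(start, None)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(dist.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Walk over all edges, then only over specialization/generalization edges.
    print(short_walk_scores(EDGES, "dango", steps=1))
    print(short_walk_scores(EDGES, "dango", steps=1, allowed_types={"S", "G"}))
```

Restricting `allowed_types` is one simple way to let the edge labels influence the walk: in this toy graph, dropping error-correction edges removes the misspelled query from the candidates. Other uses of the labels, such as reweighting instead of filtering, are equally plausible.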