Predicting query reformulation during web searching

Authors:
Bernard J. Jansen;Danielle Booth;Amanda Spink
Affiliations:
The Pennsylvania State University, University Park, PA, USA;The Pennsylvania State University, University Park, PA, USA;Queensland University of Technology, Brisbane, PQ, Australia
Venue:
CHI '09 Extended Abstracts on Human Factors in Computing Systems
Year:
2009

Citing 8
Cited 2

A prediction system for multimedia pre-fetching in Internet

MULTIMEDIA '00 Proceedings of the eighth ACM international conference on Multimedia
A review of web searching studies and a framework for future research

Journal of the American Society for Information Science and Technology
Combining evidence for automatic web session identification

Information Processing and Management: an International Journal - Issues of context in information retrieval
Using terminological feedback for web search refinement: a log-based study

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Query length in interactive information retrieval

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Mining longitudinal web queries: trends and patterns

Journal of the American Society for Information Science and Technology
If not now, when?: the effects of interruption at different moments within task execution

Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Analysis of multiple query reformulations on the web: the interactive information retrieval context

Information Processing and Management: an International Journal

Web searching interaction model based on user cognitive styles

Proceedings of the 22nd Conference of the Computer-Human Interaction Special Interest Group of Australia on Computer-Human Interaction
Find it if you can: a game for modeling different types of web search success using interaction data

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval

Abstract

This paper reports results from a study in which we automatically classified the query reformulation patterns for 964,780 Web searching sessions (composed of 1,523,072 queries) in order to predict what the next query reformulation would be. We employed an n-gram modeling approach to describe the probability of searchers transitioning from one query reformulation state to another and to predict their next state. We developed first, second, third, and fourth order models and evaluated each model for accuracy of prediction. Findings show that Reformulation and Assistance account for approximately 45 percent of all query reformulations. Searchers seem to seek system searching assistance early in the session or after a content change. The results of our evaluations show that the first and second order models provided the best predictability, between 28 and 40 percent overall, and higher than 70 percent for some patterns. The implication is that the n-gram approach can be used to improve searching systems and searching assistance in real time.
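
The n-gram idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`train_ngram`, `transition_probs`, `predict_next`), the state label "New", and the toy sessions are all assumptions made for illustration, and the probabilities are plain maximum-likelihood estimates from transition counts. An order-k model conditions the next reformulation state on the previous k states.

```python
from collections import Counter, defaultdict


def train_ngram(sessions, order=1):
    """Count how often each next state follows each `order`-length history."""
    counts = defaultdict(Counter)
    for states in sessions:
        for i in range(order, len(states)):
            history = tuple(states[i - order:i])
            counts[history][states[i]] += 1
    return counts


def transition_probs(counts, history):
    """Maximum-likelihood estimate of P(next state | history)."""
    following = counts.get(tuple(history), Counter())
    total = sum(following.values())
    return {state: n / total for state, n in following.items()} if total else {}


def predict_next(counts, history):
    """Return the most frequent next state, or None for an unseen history."""
    following = counts.get(tuple(history))
    return following.most_common(1)[0][0] if following else None


# Toy sessions with hypothetical state labels; NOT data from the study.
sessions = [
    ["New", "Reformulation", "Assistance"],
    ["New", "Assistance", "Reformulation"],
    ["New", "Reformulation", "Reformulation"],
    ["New", "Reformulation", "Assistance"],
]

first_order = train_ngram(sessions, order=1)
second_order = train_ngram(sessions, order=2)

print(transition_probs(first_order, ["New"]))
print(predict_next(first_order, ["New"]))
print(predict_next(second_order, ["New", "Reformulation"]))
```

Higher-order models condition on longer histories, so each history is observed less often. That sparsity is one plausible reason a lower-order model can predict as well as or better than a higher-order one, though the abstract itself does not state the cause.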