A prediction model for web search hit counts using word frequencies

  • Authors: Tian Tian; Soon Ae Chun; James Geller

  • Affiliations: New Jersey Institute of Technology, USA; City University of New York, USA; New Jersey Institute of Technology, USA

  • Venue: Journal of Information Science
  • Year: 2011

Abstract

A search engine user with a well-defined information need is interested not in thousands of hits, but in a few hits that are all highly relevant to the search. Search words often need to be refined and augmented to narrow the results to more relevant pages. However, an overly specific query may return no hits at all, while most typical queries return thousands or even millions of them; both outcomes are undesirable. This paper suggests a query rewriting method for generating alternative query strings and proposes a hit count prediction model that predicts the number of search engine hits for each alternative query string, based on the English language frequencies of the words in the search terms. Using the hit count prediction model, different types of search strategies, such as a lowest hit count query preference, can be utilized to improve users' search experience. We present an evaluation experiment of the hit count prediction model for three major search engines. We also discuss and quantify how far the Google, Yahoo! and Bing search engines diverge from monotonic behaviour, considering negative and positive search terms separately.
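
The abstract does not give the model's formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that words occur independently across pages, so the predicted hit count of a conjunctive query is the index size multiplied by the product of the word frequencies. The frequency table, the index size, and the names `predict_hits`, `alternative_queries` and `lowest_hit_query` are all hypothetical and chosen for illustration.

```python
from itertools import combinations
from math import prod

# Hypothetical relative word frequencies (fraction of pages containing the word).
# Illustrative values only, not data from the paper.
WORD_FREQ = {"jaguar": 0.002, "car": 0.05, "speed": 0.02, "review": 0.03}
INDEX_SIZE = 1_000_000_000  # assumed number of indexed pages


def predict_hits(words, index_size=INDEX_SIZE, freq=WORD_FREQ):
    """Estimate hits for a conjunctive query, assuming independent word occurrence."""
    return index_size * prod(freq[w] for w in words)


def alternative_queries(required, optional):
    """Yield the required words combined with every subset of the optional words."""
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            yield tuple(required) + extra


def lowest_hit_query(required, optional, min_hits=1.0):
    """Return (predicted_hits, query) with the fewest predicted hits >= min_hits."""
    candidates = [(predict_hits(q), q) for q in alternative_queries(required, optional)]
    viable = [c for c in candidates if c[0] >= min_hits]
    return min(viable) if viable else None


if __name__ == "__main__":
    hits, query = lowest_hit_query(["jaguar"], ["car", "speed", "review"], min_hits=100)
    print(query, round(hits))
```

The `min_hits` threshold stands in for the abstract's concern that an overly specific query may return nothing: the lowest hit count preference is applied only among alternatives predicted to return at least that many hits. Under the independence assumption, the predicted count can never increase when a word is added, whereas the abstract reports that real search engines do not always behave monotonically.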