Improving Web search efficiency via a locality based static pruning method

  • Authors:
  • Edleno S. de Moura;Célia F. dos Santos;Daniel R. Fernandes;Altigran S. Silva;Pavel Calado;Mario A. Nascimento

  • Affiliations:
  • Federal University of Amazonas, Brazil;Federal University of Amazonas, Brazil;Federal University of Amazonas, Brazil;Federal University of Amazonas, Brazil;INESC-ID, Portugal;University of Alberta, Canada

  • Venue:
  • WWW '05 Proceedings of the 14th international conference on World Wide Web
  • Year:
  • 2005

Abstract

The fast and continuous growth in the volume of indexed (and indexable) documents on the Web poses a great challenge for search engines. This is true not only of search effectiveness but also of time and space efficiency. In this paper we present an index pruning technique targeted at search engines that addresses the latter issue without disregarding the former. To this end, we adopt a new pruning strategy capable of greatly reducing the size of search engine indices. Experiments using a real search engine show that our technique can reduce the indices' storage costs by up to 60% over traditional lossless compression methods, while keeping the loss in retrieval precision to a minimum. When compared to the index size with no compression at all, the compression rate is higher than 88%, i.e., the pruned index occupies less than one eighth of the original size. More importantly, our results indicate that, due to the reduction in storage overhead, query processing time can be reduced to nearly 65% of the original time, with no loss in average precision. The new method yields significant improvements when compared against the best known static pruning method for search engine indices. In addition, since our technique is orthogonal to the underlying search algorithms, it can be adopted by virtually any search engine.