Search and Ranking Algorithms for Locating Resources on the World Wide Web

  • Authors: Budi Yuwono; Dik Lun Lee

  • Venue: ICDE '96: Proceedings of the Twelfth International Conference on Data Engineering
  • Year: 1996

Abstract

Applying information retrieval techniques to the World Wide Web (WWW) environment is a challenge, mostly because of its hypertext/hypermedia nature and the richness of the meta-information it provides. We present four keyword-based search and ranking algorithms for locating relevant WWW pages with respect to user queries. The first algorithm, Boolean Spreading Activation, extends the notion of word occurrence in the Boolean retrieval model by propagating the occurrence of a query word in a page to other pages linked to it. The second algorithm, Most-cited, uses the number of citing hyperlinks between potentially relevant WWW pages to increase the relevance scores of the referenced pages over the referencing pages. The third algorithm, TFxIDF vector space model, is based on word distribution statistics. The last algorithm, Vector Spreading Activation, combines TFxIDF with the spreading activation model. We conducted an experiment to evaluate the retrieval effectiveness of these algorithms. From the results of the experiment, we draw conclusions regarding the nature of the WWW environment with respect to document ranking strategies.
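The abstract does not give scoring formulas, so the following Python sketch only illustrates the general idea behind two of the four algorithms and is not the authors' implementation. The function names, the weights 2 and 1, the treatment of links as undirected in the spreading step, and the reading of "potentially relevant" as "contains at least one query word" are all assumptions made for illustration.

```python
from collections import defaultdict


def boolean_spreading_activation(pages, links, query_words):
    """Score pages by direct and propagated query-word occurrence.

    pages: dict mapping page id -> set of words on that page.
    links: iterable of (source, target) hyperlink pairs.
    Assumption: links are treated as undirected for propagation, and
    the weights 2 (direct) and 1 (propagated) are arbitrary choices.
    """
    neighbours = defaultdict(set)
    for src, dst in links:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    scores = {}
    for page, words in pages.items():
        score = 0
        for word in query_words:
            if word in words:
                score += 2  # the word occurs on the page itself
            elif any(word in pages.get(n, ()) for n in neighbours[page]):
                score += 1  # the word occurs only on a linked page
        scores[page] = score
    return scores


def most_cited(pages, links, query_words):
    """Score each candidate page by how many other candidates cite it.

    Assumption: a page is "potentially relevant" (a candidate) if it
    contains at least one query word.
    """
    query = set(query_words)
    candidates = {p for p, words in pages.items() if words & query}
    scores = {p: 0 for p in candidates}
    for src, dst in links:
        if src in candidates and dst in candidates and src != dst:
            scores[dst] += 1  # the referenced page gains over the referencing page
    return scores


if __name__ == "__main__":
    # Hypothetical four-page collection, for illustration only.
    pages = {
        "a": {"web", "search"},
        "b": {"search", "ranking"},
        "c": {"cooking"},
        "d": {"web"},
    }
    links = [("a", "b"), ("d", "b"), ("b", "c"), ("c", "a")]
    print(boolean_spreading_activation(pages, links, ["search"]))
    print(most_cited(pages, links, ["search"]))
```

In this toy collection, pages "c" and "d" do not contain the query word but still receive a nonzero spreading-activation score because they are linked to pages that do, while Most-cited ranks "b" above "a" because "a" links to "b".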