A hybrid approach for refreshing web page repositories

  • Authors:
  • M. Ghodsi; O. Hassanzadeh; Sh. Kamali; M. Monemizadeh

  • Affiliations:
  • Computer Engineering Department, Sharif University of Technology, Tehran, Iran (all authors)

  • Venue:
  • DASFAA'05: Proceedings of the 10th International Conference on Database Systems for Advanced Applications
  • Year:
  • 2005

Abstract

Web pages change frequently, so crawlers have to download them often. Various policies have been proposed for refreshing local copies of web pages. In this paper, we introduce a new sampling method that outperforms other change detection methods in our experiments. Change Frequency (CF) is a method that predicts how often pages change and, in the long run, achieves optimal efficiency in comparison with the sampling method. We then propose a new hybrid method that combines our sampling approach with CF, and show how it improves the efficiency of change detection.
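
The abstract does not say how the sampling and CF estimates are combined, so the following is only an illustrative sketch, not the authors' algorithm. It assumes a per-site sampled change ratio, a history-based change frequency estimate, and a simple weighted average of the two. The names `SiteStats`, `hybrid_score`, `refresh_order`, the `weight` parameter, and the example data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SiteStats:
    sampled: int          # pages sampled from the site in this cycle
    sampled_changed: int  # sampled pages found to have changed
    visits: int           # past visits to the site's pages
    changes_seen: int     # past visits that detected a change


def sample_ratio(s: SiteStats) -> float:
    """Sampling view: fraction of the current sample that changed."""
    return s.sampled_changed / s.sampled if s.sampled else 0.0


def cf_estimate(s: SiteStats) -> float:
    """CF view: fraction of past visits that detected a change."""
    return s.changes_seen / s.visits if s.visits else 0.0


def hybrid_score(s: SiteStats, weight: float = 0.5) -> float:
    """Assumed combination: weighted average of the two estimates."""
    return weight * sample_ratio(s) + (1 - weight) * cf_estimate(s)


def refresh_order(sites: dict[str, SiteStats], weight: float = 0.5) -> list[str]:
    """Sites sorted so the ones most likely to have changed come first."""
    return sorted(sites, key=lambda name: hybrid_score(sites[name], weight),
                  reverse=True)


# Made-up illustrative data, not results from the paper.
sites = {
    "news.example": SiteStats(sampled=10, sampled_changed=8, visits=100, changes_seen=60),
    "docs.example": SiteStats(sampled=10, sampled_changed=1, visits=100, changes_seen=10),
    "blog.example": SiteStats(sampled=10, sampled_changed=4, visits=100, changes_seen=50),
}
print(refresh_order(sites))
```

Setting `weight` to 1.0 reduces this sketch to pure sampling and 0.0 to pure CF, which makes it easy to compare the three policies on the same data.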