Prefetching based on web usage mining

  • Authors: Daby M. Sow; David P. Olshefski; Mandis Beigi; Guruduth Banavar

  • Affiliations: IBM T. J. Watson Research Center, Hawthorne, NY (all authors)

  • Venue: Proceedings of the ACM/IFIP/USENIX 2003 International Conference on Middleware
  • Year: 2003

Abstract

This paper introduces a new technique for prefetching web content by learning the access patterns of individual users. The prediction scheme is based on a learning algorithm called Fuzzy-LZ, which mines each user's access history and identifies patterns of recurring accesses. The algorithm is evaluated analytically using a metric called learnability and validated experimentally by correlating learnability with prediction accuracy. A web prefetching system that incorporates Fuzzy-LZ is described and evaluated. Our experiments demonstrate that Fuzzy-LZ prefetching provides a gain of 41.5% in cache hit rate over pure caching. This gain is highest for users who are neither highly predictable nor highly random, who turn out to be the vast majority of users in our workload. The overhead of our prefetching technique for a typical user is 2.4 prefetched pages per user request.
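
The abstract does not specify how Fuzzy-LZ works internally, so the following is only a minimal sketch of the general idea it builds on: mining a per-user access history with a plain LZ78-style parse tree and using the current context to pick prefetch candidates. It is not the authors' algorithm. The class name `LZ78Predictor`, the parameter `k`, and the fallback to the root when the current context has no recorded successors are all assumptions made for illustration.

```python
class LZ78Predictor:
    """Plain LZ78 parse tree over a page-access stream.

    Illustrative only: this is NOT Fuzzy-LZ, whose details are not
    given in the abstract.
    """

    def __init__(self):
        # Each node is a dict mapping next_page -> [visit_count, child_node].
        self.root = {}
        self.node = self.root  # current context within the parse tree

    def update(self, page):
        """Feed one observed page access into the parse tree."""
        if page in self.node:
            # Known continuation: count it and descend into that phrase.
            entry = self.node[page]
            entry[0] += 1
            self.node = entry[1]
        else:
            # New phrase ends here: record it and restart at the root.
            self.node[page] = [1, {}]
            self.node = self.root

    def predict(self, k=2):
        """Return up to k pages most often seen after the current context."""
        # Assumed fallback: with no successors recorded for this context,
        # rank the pages that most often start a phrase instead.
        candidates = self.node or self.root
        ranked = sorted(candidates.items(), key=lambda kv: -kv[1][0])
        return [page for page, _ in ranked[:k]]
```

For instance, after feeding the hypothetical history `/home`, `/news`, `/home`, `/news`, `/home` through `update`, calling `predict(k=1)` returns `['/news']`, which a prefetcher could then fetch ahead of the next request. The value of `k` trades hit rate against overhead, in the spirit of the prefetched-pages-per-request figure the abstract reports.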