Prefetching based on web usage mining

Authors:
Daby M. Sow;David P. Olshefski;Mandis Beigi;Guruduth Banavar
Affiliations:
IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY
Venue:
Proceedings of the ACM/IFIP/USENIX 2003 International Conference on Middleware
Year:
2003

Abstract

This paper introduces a new technique for prefetching web content by learning the access patterns of individual users. The prediction scheme for prefetching is based on a learning algorithm, called Fuzzy-LZ, which mines the history of user access and identifies patterns of recurring accesses. This algorithm is evaluated analytically via a metric called learnability and validated experimentally by correlating learnability with prediction accuracy. A web prefetching system that incorporates Fuzzy-LZ is described and evaluated. Our experiments demonstrate that Fuzzy-LZ prefetching provides a gain of 41.5% in cache hit rate over pure caching. This gain is highest for those users who are neither highly predictable nor highly random, which turns out to be the vast majority of users in our workload. The overhead of our prefetching technique for a typical user is 2.4 prefetched pages per user request.