Caching for Web Searching

Authors:
Bala Kalyanasundaram;John Noga;Kirk Pruhs;Gerhard J. Woeginger
Venue:
SWAT '00 Proceedings of the 7th Scandinavian Workshop on Algorithm Theory
Year:
2000

Abstract

We study web caching when the input sequence is a depth-first search traversal of some tree. There are at least two good motivations for investigating tree traversal as a search technique on the WWW. First, empirical studies of people browsing and searching the WWW have shown that user access patterns are commonly close to depth-first traversals of some tree. Second, as we show in this paper, the problem of visiting all the pages on a WWW site using anchor clicks (clicks on links) and back button clicks, by far the two most common user actions, reduces, up to constant factors, to the problem of how best to cache a tree traversal sequence. We show that for tree traversal sequences the optimal offline strategy can be computed efficiently. In the bit model, where the access time of a page is proportional to its size, we show that the online algorithm LRU is (1+1/ε)-competitive against an adversary with an unbounded cache, as long as LRU has a cache of size at least (1+ε) times the size of the largest item in the input sequence. In the general model, where pages have arbitrary access times and sizes, we show that in order to be constant competitive, any online algorithm needs a cache large enough to store Ω(log n) pages; here n is the number of distinct pages in the input sequence. We provide a matching upper bound by showing that the online algorithm Landlord is constant competitive against an adversary with an unbounded cache if Landlord has a cache large enough to store the Ω(log n) largest pages. This is further theoretical evidence that Landlord is the "right" algorithm for web caching.
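
The setting described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the function names (dfs_sequence, lru_cost, landlord_cost), the toy tree, and the modelling choices are assumptions made for illustration. It assumes that a back button click re-requests the parent page, that a miss in the bit model costs the page's size, and that Landlord follows its usual credit-based description (on a hit the credit is reset to the full cost, and only as many zero-credit pages are evicted as are needed to make room).

```python
from collections import OrderedDict
from fractions import Fraction


def dfs_sequence(tree, root):
    """Request sequence of a depth-first traversal (assumed model):
    an anchor click requests a child, a back click re-requests the parent."""
    seq = [root]
    for child in tree.get(root, []):
        seq.extend(dfs_sequence(tree, child))
        seq.append(root)  # back button returns to the parent page
    return seq


def lru_cost(seq, size, capacity):
    """LRU in the bit model: a miss on page p costs size[p], a hit is free."""
    cache = OrderedDict()  # page -> size, least recently used first
    used = 0
    total = 0
    for p in seq:
        if p in cache:
            cache.move_to_end(p)  # mark as most recently used
            continue
        total += size[p]
        if size[p] > capacity:
            continue  # page can never fit; fetch it without caching
        while used + size[p] > capacity:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[p] = size[p]
        used += size[p]
    return total


def landlord_cost(seq, size, cost, capacity):
    """Landlord in the general model: pages have arbitrary sizes and costs."""
    credit = {}  # cached page -> remaining credit (exact rationals)
    used = 0
    total = 0
    for p in seq:
        if p in credit:
            credit[p] = Fraction(cost[p])  # hit: reset credit (one allowed choice)
            continue
        total += cost[p]
        if size[p] > capacity:
            continue
        while used + size[p] > capacity:
            # Charge every cached page "rent" proportional to its size.
            delta = min(credit[q] / size[q] for q in credit)
            for q in credit:
                credit[q] -= delta * size[q]
            # Evict zero-credit pages, but only until the new page fits.
            for q in [q for q in credit if credit[q] == 0]:
                if used + size[p] <= capacity:
                    break
                used -= size[q]
                del credit[q]
        credit[p] = Fraction(cost[p])
        used += size[p]
    return total


# Hypothetical toy site: page "a" links to "b" and "c"; "b" links to "d".
tree = {"a": ["b", "c"], "b": ["d"]}
size = {page: 1 for page in "abcd"}
seq = dfs_sequence(tree, "a")
print(seq)
print(lru_cost(seq, size, capacity=2), lru_cost(seq, size, capacity=10))
print(landlord_cost(seq, size, size, capacity=2))
```

With a cache large enough to hold every page, each algorithm pays only for the first request to each distinct page, which is the cost incurred by the unbounded-cache adversary the abstract compares against.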