Optimal Web cache sizing: scalable methods for exact solutions

  • Authors:
  • T. Kelly; D. Reeves

  • Affiliations:
  • Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA (both authors)

  • Venue:
  • Computer Communications
  • Year:
  • 2001


Abstract

This paper describes two approaches to the problem of determining exact optimal storage capacity for Web caches based on expected workload and the monetary costs of memory and bandwidth. The first approach considers memory/bandwidth tradeoffs in an idealized model. It assumes that workload consists of independent references drawn from a known distribution (e.g. Zipf) and caches employ a "Perfect LFU" removal policy. We derive conditions under which a shared higher-level "parent" cache serving several lower-level "child" caches is economically viable. We also characterize circumstances under which globally optimal storage capacities in such a hierarchy can be determined through a decentralized computation in which caches individually minimize local monetary expenditures.

The second approach is applicable if the workload at a single cache is represented by an explicit request sequence and the cache employs any one of a large family of removal policies that includes LRU. The miss costs associated with individual requests may be completely arbitrary, and the cost of cache storage need only be monotonic. We use an efficient single-pass simulation algorithm to compute aggregate miss cost as a function of cache size in O(M log M) time and O(M) memory, where M is the number of requests in the workload. Because it allows us to compute arbitrarily weighted hit rates at all cache sizes with modest computational resources, this algorithm permits us to measure cache performance with no loss of precision. The same basic algorithm also permits us to compute complete stack distance transformations in O(M log N) time and O(N) memory, where N is the number of unique items referenced. Experiments on very large reference streams show that our algorithm computes stack distances more quickly than several alternative approaches, demonstrating that it is a useful tool for measuring temporal locality in cache workloads.
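To illustrate the kind of single-pass stack-distance computation the abstract describes, here is a minimal sketch using a Fenwick (binary indexed) tree over request timestamps. This is not the authors' exact algorithm or data structure; it is a standard alternative that runs in O(M log M) time and O(M) memory, slightly weaker than the O(M log N) time and O(N) memory bounds reported in the paper. The convention assumed here: a request's stack distance is the number of distinct items referenced since its previous access (so a request hits in an LRU cache of size C exactly when its distance is less than C), and a first reference has infinite distance, represented as None.

```python
class Fenwick:
    """Fenwick tree supporting point updates and prefix sums, 1-based."""
    def __init__(self, n):
        self.tree = [0] * (n + 1)

    def add(self, i, delta):
        while i < len(self.tree):
            self.tree[i] += delta
            i += i & -i

    def prefix(self, i):
        # Sum of positions 1..i.
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & -i
        return s


def stack_distances(trace):
    """Return per-request LRU stack distances for a request sequence.

    Entry t is the number of distinct items referenced since the previous
    access to trace[t], or None on a cold (first-reference) miss.
    Runs in O(M log M) time for a trace of M requests.
    """
    bit = Fenwick(len(trace))
    last = {}   # item -> 1-based time of its most recent access
    out = []
    for t, item in enumerate(trace, start=1):
        if item in last:
            prev = last[item]
            # Count distinct items whose latest access falls strictly
            # between prev and t: these sit above `item` in the LRU stack.
            out.append(bit.prefix(t - 1) - bit.prefix(prev))
            bit.add(prev, -1)   # prev is no longer item's latest access
        else:
            out.append(None)    # infinite stack distance
        bit.add(t, 1)           # mark t as item's latest access
        last[item] = t
    return out
```

With the resulting distance histogram, the hit rate at every cache size C follows immediately: it is the fraction of requests whose distance is below C, which is the precise, all-sizes measurement the abstract refers to.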