Thin-client Web access patterns: Measurements from a cache-busting proxy

  • Authors: Terence Kelly
  • Affiliations: Department of Electrical Engineering and Computer Science, University of Michigan, 1101 Beal Avenue, Ann Arbor, MI 48109-2110, USA
  • Venue: Computer Communications
  • Year: 2002

Abstract

This paper describes a new technique for measuring Web client request patterns and analyzes a large client trace collected using the new method. In this approach, a modified proxy intercepts requests and marks every response it serves to clients as uncacheable, effectively disabling browser caches and allowing the proxy to record requests that would otherwise result in silent browser cache hits. WebTV Networks used such a 'cache-busting proxy' to collect an unusually large and detailed anonymized Web client trace in September 2000. The trace spans 16 days and contains over 347 million requests for over 36 million documents by over 37,000 clients. By most measures, it is two orders of magnitude larger than existing Web client traces. We compare cache-busting proxies with conventional client instrumentation and use the WebTV trace to explore browser cache performance, reference locality, and document aliasing. We present the aggregate browser cache success function (hit rate vs. cache size) of the entire client population and discuss design implications for memory- and bandwidth-constrained Web clients. For the workload studied, eliminating redundant data transfers would increase browser cache hit rates by 35-45% over their current levels. A simple and practical technique for eliminating redundant transfers is described. Document sharing across client reference streams is so strong that the hit rate of a shared proxy cache could exceed 57% even if browser caches were infinitely large.
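
The cache-busting step described in the abstract can be illustrated with a minimal sketch. It is not the WebTV proxy's actual implementation, which the abstract does not specify. The function name `bust_cache_headers`, the list-of-pairs header representation, and the specific header values are all assumptions chosen for illustration. The sketch only shows the general idea of rewriting response headers so that a standards-following browser should not reuse the response, which forces each repeat request back through the proxy where it can be logged.

```python
# Hypothetical helper: not taken from the paper. It illustrates one way a
# proxy could mark a response uncacheable before forwarding it to a client.

# Caching-related headers that the sketch replaces (compared in lowercase).
_REPLACED = {"cache-control", "expires", "pragma"}


def bust_cache_headers(headers):
    """Return a new header list with browser caching disabled.

    `headers` is a list of (name, value) pairs, as an assumption about how
    the proxy represents a response. All other headers pass through
    unchanged and keep their original order.
    """
    out = [(name, value) for name, value in headers
           if name.lower() not in _REPLACED]
    # HTTP/1.1 directive telling the client not to store or reuse the body.
    out.append(("Cache-Control", "no-store, no-cache, must-revalidate, max-age=0"))
    # Equivalent hints for older HTTP/1.0 clients.
    out.append(("Pragma", "no-cache"))
    out.append(("Expires", "0"))
    return out


if __name__ == "__main__":
    original = [
        ("Content-Type", "text/html"),
        ("Cache-Control", "max-age=3600"),
    ]
    for name, value in bust_cache_headers(original):
        print(f"{name}: {value}")
```

Because every response reaches the client with these headers, a request that a normal browser cache would have answered silently instead appears in the proxy log, which is what makes the trace complete.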