Transparent caching with strong consistency in dynamic content web sites

  • Authors: Cristiana Amza; Gokul Soundararajan; Emmanuel Cecchet

  • Affiliations: Toronto, Canada; Toronto, Canada; INRIA, Rhone-Alpes, France

  • Venue: Proceedings of the 19th annual international conference on Supercomputing
  • Year: 2005

Abstract

We consider a cluster architecture in which dynamic content is generated by a database back-end and a collection of Web and application server front-ends. We study the effect of transparent query caching on the performance of such a cluster. Transparency requires that cached entries be invalidated as a result of writes. We start with a coarse-grain, table-level automatic invalidation cache. Based on observed workload characteristics, we enhance the cache with the necessary dependency tracking and invalidations at the finer granularity of columns. Finally, we reduce the miss penalty of invalidations through full and partial coverage of query results.

In terms of system design, a query cache may be located at the database back-end, on dedicated machines, on the front-ends, or on a combination thereof. This paper evaluates the tradeoffs of the different cache designs and cache locations using the TPC-W benchmark.

Our experiments show that our transparent query cache improves performance substantially: by up to a factor of 1.5 in throughput and 4.2 in response time compared to the baseline table-based invalidation scheme. An important contributor to this result is our optimization for reducing the miss penalty through full and partial coverage detection of query results from the cache, which improves response time by up to a factor of 2.9 compared to a cache with fine-grained column-based invalidations alone. Thus, the benefits of the higher hit ratio in our optimizations outweigh the costs of additional processing. The results are less clear-cut in terms of where to locate the cache: performance differences when varying the cache location and the number of caches are small.
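
To make the difference between table-level and column-level invalidation concrete, the following is a minimal sketch, not the authors' implementation. The class `QueryCache`, its methods, and the `item` table with its columns are hypothetical names chosen for illustration. The sketch also assumes that the caller supplies the set of (table, column) pairs each query reads, whereas a transparent cache would have to derive that set from the SQL itself.

```python
from dataclasses import dataclass, field


@dataclass
class QueryCache:
    """Toy query-result cache with table- or column-level invalidation (illustrative only)."""
    column_level: bool = False
    # query text -> (cached result, set of (table, column) pairs the query reads)
    entries: dict = field(default_factory=dict)

    def get(self, query):
        hit = self.entries.get(query)
        return None if hit is None else hit[0]

    def put(self, query, result, reads):
        # Assumption: `reads` is provided by the caller rather than parsed from SQL.
        self.entries[query] = (result, frozenset(reads))

    def on_write(self, table, columns):
        """Drop entries that conflict with a write; return how many were dropped."""
        written = {(table, c) for c in columns}
        stale = []
        for query, (_, reads) in self.entries.items():
            if self.column_level:
                # Fine grain: conflict only if a written column is actually read.
                conflict = bool(reads & written)
            else:
                # Coarse grain: any read of the written table is a conflict.
                conflict = any(t == table for t, _ in reads)
            if conflict:
                stale.append(query)
        for query in stale:
            del self.entries[query]
        return len(stale)


def demo():
    dropped = {}
    for column_level in (False, True):
        cache = QueryCache(column_level=column_level)
        cache.put("SELECT title FROM item WHERE id = 1", ["Book A"],
                  {("item", "title"), ("item", "id")})
        cache.put("SELECT stock FROM item WHERE id = 1", [7],
                  {("item", "stock"), ("item", "id")})
        # Simulate an UPDATE that modifies only item.stock.
        dropped[column_level] = cache.on_write("item", ["stock"])
    return dropped
```

In this sketch, `demo()` applies the same update under both policies: the table-level cache drops both cached queries, while the column-level cache drops only the query that reads `item.stock`. Keeping the unrelated entry is the kind of hit-ratio gain that finer-grained invalidation aims for, at the cost of tracking more dependencies per entry.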