A self-managing data cache for edge-of-network web applications

  • Authors:
  • Khalil Amiri; Sanghyun Park; Renu Tewari

  • Affiliations:
  • IBM T. J. Watson Research Center, Hawthorne, NY; Pohang University of Science and Technology, Pohang, Korea; IBM Almaden Research Center, San Jose, CA

  • Venue:
  • Proceedings of the eleventh international conference on Information and knowledge management
  • Year:
  • 2002

Abstract

Database caching at proxy servers enables dynamic content to be generated at the edge of the network, thereby improving the scalability and response time of web applications. The scale of edge-server deployment, coupled with the rising cost of administering these servers, demands that such caching middleware be adaptive and self-managing. To achieve this, a cache must be dynamically populated and pruned based on the application query stream and access pattern. In this paper, we describe such a cache, which maintains a large number of materialized views of previous query results. Cached "views" share physical storage to avoid redundancy, and are usually added and evicted dynamically to adapt to the current workload and to available resources. These two properties, large scale (a large number of cached views) and overlapping storage, introduce several challenges to query matching and storage management that are not addressed by traditional approaches. We present an edge data cache architecture with a flexible query matching algorithm and a novel storage management policy that work well in such an environment. We evaluate a prototype of this architecture using the TPC-W benchmark and find that it reduces query response times by up to 75%, while also reducing network and server load.
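
The idea of cached views sharing physical storage can be illustrated with a small sketch. This is not the paper's algorithm: the class name `SharedViewCache`, the row-count capacity, exact-text query matching, and least-recently-used eviction are all assumptions chosen to keep the example short. It shows only the general technique of storing each row once, counting how many views reference it, and freeing a row when the last referencing view is evicted.

```python
from collections import OrderedDict


class SharedViewCache:
    """Toy cache (hypothetical): views are sets of row keys over one shared row store."""

    def __init__(self, max_rows):
        self.max_rows = max_rows      # capacity, counted in distinct stored rows
        self.rows = {}                # row key -> row data, stored once
        self.refcount = {}            # row key -> number of views referencing it
        self.views = OrderedDict()    # query text -> set of row keys, in LRU order

    def add_view(self, query, result):
        """Cache a query result given as {row_key: row}."""
        if query in self.views:
            self.evict(query)         # replace a stale copy of the same view
        for key, row in result.items():
            self.rows[key] = row
            self.refcount[key] = self.refcount.get(key, 0) + 1
        self.views[query] = set(result)
        # Assumed policy: evict least recently used views until the store fits,
        # but never evict the view that was just added.
        while len(self.rows) > self.max_rows and len(self.views) > 1:
            self.evict(next(iter(self.views)))

    def lookup(self, query):
        """Exact-match lookup (a simplification); returns rows or None on a miss."""
        keys = self.views.get(query)
        if keys is None:
            return None
        self.views.move_to_end(query)  # mark as most recently used
        return {k: self.rows[k] for k in keys}

    def evict(self, query):
        """Drop a view; free only the rows that no other view still references."""
        for key in self.views.pop(query):
            self.refcount[key] -= 1
            if self.refcount[key] == 0:
                del self.refcount[key]
                del self.rows[key]
```

In this sketch, two views whose results overlap store the shared rows once, and evicting one view leaves those rows in place for the other. A real edge cache would need a far more flexible matching step than exact query text, which is the harder problem the abstract points to.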