A scalable and highly available system for serving dynamic data at frequently accessed web sites

  • Authors:
  • Jim Challenger;Paul Dantzig;Arun Iyengar

  • Affiliations:
  • IBM Research, Yorktown Heights, NY;IBM Research, Yorktown Heights, NY;IBM Research, Yorktown Heights, NY

  • Venue:
  • SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
  • Year:
  • 1998

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper describes the system and key techniques used for achieving performance and high availability at the official Web site for the 1998 Olympic Winter Games which was one of the most popular Web sites for the duration of the Olympic Games. The Web site utilized thirteen SP2 systems scattered around the globe containing a total of 143 processors. A key feature of the Web site was that the data being presented to clients was constantly changing. Whenever new results were entered into the system, updated Web pages reflecting the changes were made available to the rest of the world within seconds.One technique we used to serve dynamic data efficiently to clients was to cache dynamic pages so that they only had to be generated once. We developed and implemented a new algorithm we call Data Update Propagation (DUP) which identifies the cached pages that have become stale as a result of changes to underlying data on which the cached pages depend, such as databases. For the Olympic Games Web site, we were able to update stale pages directly in the cache which obviated the need to invalidate them. This allowed us to achieve cache hit rates of close to 100%.Our system was able to serve pages to clients quickly during the entire Olympic Games even during peak periods. In addition, the site was available 100% of the time. We describe the key features employed by our site for high availability. We also describe how the Web site was structured to provide useful information while requiring clients to examine only a small number of pages.