Document replication and distribution in extensible geographically distributed web servers

  • Authors:
  • Ling Zhuo;Cho-Li Wang;Francis C. M. Lau

  • Affiliations:
  • Department of Computer Science and Information Systems, The University of Hong Kong, Pokfulam Road, Hong Kong;Department of Computer Science and Information Systems, The University of Hong Kong, Pokfulam Road, Hong Kong;Department of Computer Science and Information Systems, The University of Hong Kong, Pokfulam Road, Hong Kong

  • Venue:
  • Journal of Parallel and Distributed Computing - Scalable web services and architecture
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

A geographically distributed web server (GDWS) system consists of multiple server nodes interconnected by a metropolitan area network (MAN) or a wide area network (WAN). It can achieve better efficiency in handling ever-increasing web requests than centralized web servers because its throughput will not be limited by available bandwidth connecting to a central server. The key research issue in the design of GDWS is how to replicate and distribute the documents of a website among the server nodes. This paper proposes a density-based replication scheme and applies it to our proposed Extensible GDWS architecture. We adopted a partial duplication scheme where document replication targets only at hot objects in a website. To distribute the replicas generated via the density-based replication scheme, we propose four different document distribution algorithms: Greedy-cost, Maximal-density, Greedy-penalty, and Proximity-aware. A proximity-based routing mechanism is designed to incorporate these algorithms for achieving better web server performance in a WAN environment. Simulation results show that the Greedy-penalty algorithm yields most stable load-balancing performance, and the Greedy-cost algorithm causes least internal traffic. Our scheme can achieve 80% of the performance of full-replication, with half the disk space.