Document replication and distribution in extensible geographically distributed web servers

Authors:
Ling Zhuo;Cho-Li Wang;Francis C. M. Lau
Affiliations:
Department of Computer Science and Information Systems, The University of Hong Kong, Pokfulam Road, Hong Kong;Department of Computer Science and Information Systems, The University of Hong Kong, Pokfulam Road, Hong Kong;Department of Computer Science and Information Systems, The University of Hong Kong, Pokfulam Road, Hong Kong
Venue:
Journal of Parallel and Distributed Computing - Scalable web services and architecture
Year:
2003

Abstract

A geographically distributed web server (GDWS) system consists of multiple server nodes interconnected by a metropolitan area network (MAN) or a wide area network (WAN). It can handle ever-increasing web request volumes more efficiently than a centralized web server because its throughput is not limited by the bandwidth available on the link to a central server. The key research issue in the design of a GDWS is how to replicate and distribute the documents of a website among the server nodes. This paper proposes a density-based replication scheme and applies it to our proposed Extensible GDWS architecture. We adopt a partial duplication scheme in which document replication targets only the hot objects in a website. To distribute the replicas generated by the density-based replication scheme, we propose four document distribution algorithms: Greedy-cost, Maximal-density, Greedy-penalty, and Proximity-aware. A proximity-based routing mechanism is designed to incorporate these algorithms and achieve better web server performance in a WAN environment. Simulation results show that the Greedy-penalty algorithm yields the most stable load-balancing performance, and the Greedy-cost algorithm causes the least internal traffic. Our scheme can achieve 80% of the performance of full replication with half the disk space.
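
The idea of replicating only hot objects under a disk budget can be illustrated with a minimal sketch. The abstract does not define "density", so the code below assumes it means requests per byte, a common knapsack-style heuristic; the names Doc, density, and allocate_replicas are hypothetical, and this is not the paper's algorithm.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    name: str
    size: int   # bytes
    hits: int   # requests observed in some measurement window


def density(doc: Doc) -> float:
    # Assumed definition (not from the paper): requests per byte.
    return doc.hits / doc.size


def allocate_replicas(docs, num_nodes, disk_budget):
    """Give every document one copy, then spend the remaining disk
    budget on extra copies of the densest documents first."""
    replicas = {d.name: 1 for d in docs}
    used = sum(d.size for d in docs)
    if used > disk_budget:
        raise ValueError("budget cannot hold one copy of every document")
    for d in sorted(docs, key=density, reverse=True):
        # At most one copy per node; stop when the budget is exhausted.
        while replicas[d.name] < num_nodes and used + d.size <= disk_budget:
            replicas[d.name] += 1
            used += d.size
    return replicas
```

With three nodes and a budget only slightly larger than the site itself, small popular documents end up on every node while a large, rarely requested file keeps a single copy. Deciding which node holds each replica is a separate step, which the paper addresses with its four distribution algorithms.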