A geographically distributed web server (GDWS) system consists of multiple server nodes interconnected by a metropolitan area network (MAN) or a wide area network (WAN). It can handle an ever-increasing volume of web requests more efficiently than a centralized web server because its throughput is not limited by the bandwidth available on the link to a central server. The key research issue in the design of a GDWS is how to replicate and distribute the documents of a website among the server nodes. This paper proposes a density-based replication scheme and applies it to our proposed Extensible GDWS architecture. We adopt a partial duplication scheme in which document replication targets only the hot objects in a website. To distribute the replicas generated by the density-based replication scheme, we propose four document distribution algorithms: Greedy-cost, Maximal-density, Greedy-penalty, and Proximity-aware. A proximity-based routing mechanism is designed to incorporate these algorithms and achieve better web server performance in a WAN environment. Simulation results show that the Greedy-penalty algorithm yields the most stable load-balancing performance, and the Greedy-cost algorithm causes the least internal traffic. Our scheme achieves 80% of the performance of full replication while using half the disk space.
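The abstract does not define "density" or spell out any of the four distribution algorithms, so the following is only an illustrative sketch of the general idea of partial replication, not the paper's method. It assumes density means access count divided by document size, selects hot documents with a greedy knapsack-style heuristic under a disk budget, and places each replica on the currently least-loaded node. The names `Doc`, `pick_hot`, and `place`, and all the numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    name: str
    size: int  # storage cost, in arbitrary units (assumed)
    hits: int  # observed access count (assumed)


def density(doc: Doc) -> float:
    # Assumed definition: accesses served per unit of disk space.
    return doc.hits / doc.size


def pick_hot(docs, budget):
    """Greedily pick the densest documents that fit within the disk budget."""
    chosen, used = [], 0
    for doc in sorted(docs, key=density, reverse=True):
        if used + doc.size <= budget:
            chosen.append(doc)
            used += doc.size
    return chosen


def place(replicas, nodes):
    """Assign each replica, most-requested first, to the least-loaded node."""
    load = {node: 0 for node in nodes}
    placement = {}
    for doc in sorted(replicas, key=lambda d: d.hits, reverse=True):
        target = min(nodes, key=lambda n: load[n])  # ties go to the first node
        placement[doc.name] = target
        load[target] += doc.hits
    return placement, load


docs = [
    Doc("index.html", 10, 1000),
    Doc("logo.png", 50, 800),
    Doc("video.mp4", 5000, 300),
    Doc("news.html", 20, 400),
]
hot = pick_hot(docs, budget=100)
placement, load = place(hot, ["A", "B"])
print([d.name for d in hot])
print(placement, load)
```

In this toy run the large, rarely requested `video.mp4` is left unreplicated, which is the intuition behind replicating only hot objects: most of the request load is covered with a fraction of the disk space.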