A Comparative Evaluation of Transparent Scaling Techniques for Dynamic Content Servers

Authors:
C. Amza; A. L. Cox; W. Zwaenepoel
Affiliations:
University of Toronto; Rice University; EPFL
Venue:
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Year:
2005

Abstract

We study several transparent techniques for scaling dynamic content web sites, and we evaluate their relative impact when used in combination. Full transparency implies strong data consistency as perceived by the user, no modifications to existing dynamic content site tiers, and no additional programming effort from the user or site administrator upon deployment. We study strategies for scheduling and load balancing queries on a cluster of replicated database back-ends. We also investigate transparent query caching as a means of enhancing database replication. Our work shows that, on an experimental platform with up to 8 database replicas, the various techniques work in synergy to improve overall scaling for the e-commerce TPC-W benchmark. We rank the techniques necessary for high performance in order of impact as follows. Key among them are scheduling strategies, such as conflict-aware scheduling, that minimize consistency maintenance overheads. The choice of load balancing strategy is less important. Transparent query result caching increases performance significantly at any given cluster size for a mostly-read workload. Its benefits are limited for write-intensive workloads, where content-aware scheduling is the only scaling option.
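
The idea of transparent query result caching with strong consistency can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `QueryResultCache`, the `execute` callback, and the choice of table-level invalidation are all assumptions made for illustration, and the caller is assumed to supply the set of tables each statement touches (a real system would have to derive this itself to stay transparent).

```python
from typing import Callable, Dict, Iterable, List, Set

Rows = List[tuple]


class QueryResultCache:
    """Hypothetical read-through cache placed in front of a database.

    Read results are cached by SQL text. Any write to a table discards
    every cached result that read from that table, so a later read never
    returns data older than the most recent write (table-level invalidation,
    an assumption of this sketch).
    """

    def __init__(self, execute: Callable[[str], Rows]) -> None:
        self._execute = execute                  # forwards a statement to the database
        self._results: Dict[str, Rows] = {}      # SQL text -> cached rows
        self._by_table: Dict[str, Set[str]] = {} # table -> SQL texts that read it
        self.hits = 0
        self.misses = 0

    def read(self, sql: str, tables: Iterable[str]) -> Rows:
        """Serve a read from the cache, or run it and remember the result."""
        if sql in self._results:
            self.hits += 1
            return self._results[sql]
        self.misses += 1
        rows = self._execute(sql)
        self._results[sql] = rows
        for table in tables:
            self._by_table.setdefault(table, set()).add(sql)
        return rows

    def write(self, sql: str, tables: Iterable[str]) -> None:
        """Run a write, then drop every cached result that read a written table."""
        self._execute(sql)
        for table in tables:
            for cached_sql in self._by_table.pop(table, set()):
                self._results.pop(cached_sql, None)
```

Under these assumptions, a mostly-read workload is served largely from the cache, while a write-intensive workload keeps invalidating entries, which is consistent with the abstract's observation that caching helps less as the share of writes grows.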