Tashkent+: memory-aware load balancing and update filtering in replicated databases

Authors:
Sameh Elnikety;Steven Dropsho;Willy Zwaenepoel
Affiliations:
School of Computer and Communication Sciences, EPFL, Switzerland;School of Computer and Communication Sciences, EPFL, Switzerland;School of Computer and Communication Sciences, EPFL, Switzerland
Venue:
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Year:
2007


Abstract

We present a memory-aware load balancing (MALB) technique to dispatch transactions to replicas in a replicated database. Our MALB algorithm exploits knowledge of the working sets of transactions to assign them to replicas in such a way that they execute in main memory, thereby reducing disk I/O. In support of MALB, we introduce a method to estimate the size and the contents of transaction working sets. We also present an optimization called update filtering that reduces the overhead of update propagation between replicas. We show that MALB greatly improves performance over other load balancing techniques, such as round robin, least connections, and locality-aware request distribution (LARD), that do not use explicit information on how transactions use memory. In particular, LARD demonstrates good performance for read-only static content Web workloads, but it performs worse than MALB on database workloads because it does not handle large requests efficiently. MALB combined with update filtering further boosts performance over LARD. We build a prototype replicated system, called Tashkent+, with which we demonstrate that MALB and update filtering improve the performance of the TPC-W and RUBiS benchmarks. In particular, in a 16-replica cluster running the ordering mix of TPC-W, MALB doubles the throughput of least connections and improves throughput by 52% over LARD. MALB with update filtering further improves throughput to triple that of least connections and more than double that of LARD. Our techniques exhibit super-linear speedup: the throughput of the 16-replica cluster is 37 times the peak throughput of a standalone database, owing to better use of the cluster's memory.
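
The idea of dispatching by working set can be illustrated with a minimal Python sketch. It is not the Tashkent+ algorithm: it assumes a simple first-fit-decreasing packing of transaction types into groups whose summed working sets fit in one replica's memory, and it ignores any overlap between working sets, which the paper's estimation of working-set contents would account for. The transaction type names, the sizes, the 800 MB memory budget, and the helpers `pack_types`, `build_routing`, and `dispatch` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Transaction types whose combined working set is meant to fit in memory."""
    types: list = field(default_factory=list)
    used_mb: int = 0


def pack_types(working_set_mb, memory_mb):
    """First-fit decreasing: put each type in the first group with room.

    A type larger than memory_mb still gets a group of its own; it will
    simply not fit in memory (an assumption of this sketch).
    """
    groups = []
    ordered = sorted(working_set_mb.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in ordered:
        for group in groups:
            if group.used_mb + size <= memory_mb:
                break
        else:  # no existing group has room
            group = Group()
            groups.append(group)
        group.types.append(name)
        group.used_mb += size
    return groups


def build_routing(groups, replicas):
    """Give each group its own disjoint subset of replicas."""
    if len(replicas) < len(groups):
        raise ValueError("need at least one replica per group")
    routing = {}
    for i, group in enumerate(groups):
        serving = replicas[i::len(groups)]
        for name in group.types:
            routing[name] = serving
    return routing


def dispatch(tx_type, routing, active):
    """Send a transaction to the least-loaded replica serving its type."""
    replica = min(routing[tx_type], key=lambda r: active[r])
    active[replica] += 1
    return replica


# Hypothetical transaction types and working-set estimates (MB).
working_set_mb = {"browse": 300, "search": 500, "order": 400, "admin": 100}
replicas = ["r0", "r1", "r2", "r3"]

groups = pack_types(working_set_mb, memory_mb=800)
routing = build_routing(groups, replicas)
active = {r: 0 for r in replicas}
first = dispatch("order", routing, active)
second = dispatch("admin", routing, active)
```

In this sketch each replica only ever sees the transaction types of one group, so its memory holds that group's working set instead of competing pieces of every working set. A balancer that ignores memory, such as round robin, would spread all four types over all four replicas.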