Scalability of write-ahead logging on multicore and multisocket hardware

Authors:
Ryan Johnson;Ippokratis Pandis;Radu Stoica;Manos Athanassoulis;Anastasia Ailamaki
Affiliations:
Department of Computer Science, University of Toronto, Toronto, Canada;IBM Almaden Research Center, San Jose, USA;School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland;School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland;School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
2012

Abstract

The shift to multi-core and multi-socket hardware brings new challenges to database systems, because software parallelism now determines performance. Even though database systems traditionally accommodate simultaneous requests, a multitude of synchronization barriers serialize execution. Write-ahead logging is a fundamental, omnipresent component in ARIES-style concurrency and recovery, and one of the most important yet-to-be-addressed potential bottlenecks, especially in OLTP workloads that make frequent small changes to data. In this paper, we identify four logging-related impediments to database system scalability. Each issue challenges a different level of the software architecture: (a) the high volume of small-sized I/O requests may saturate the disk, (b) transactions hold locks while waiting for the log flush, (c) extensive context switching overwhelms the OS scheduler with threads executing log I/Os, and (d) contention appears as transactions serialize accesses to in-memory log data structures. We demonstrate these problems and address them with techniques that, when combined, comprise a holistic, scalable approach to logging. Our solution achieves a 20-69% speedup over a modern database system when running log-intensive workloads, such as the TPC-B and TATP benchmarks, on a single-socket multiprocessor server. Moreover, it achieves log insert throughput of over 2.2 GB/s for small log records on the single-socket server, roughly 20 times higher than the traditional way of accessing the log using a single mutex. Furthermore, we investigate techniques for scaling the performance of logging to multi-socket servers. We present a set of optimizations that partly ameliorate the latency penalty that comes with multi-socket hardware, and then we investigate the feasibility of applying a distributed log buffer design at the socket level.
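To make impediment (d) concrete, the following is a minimal sketch of the baseline the abstract calls "the traditional way of accessing the log using a single mutex". It is only an illustration under stated assumptions and is not the paper's implementation: the names `SingleMutexLog`, `insert`, `size`, and `run_workers` are hypothetical, and the LSN is assumed to be the byte offset of a record in an in-memory buffer. Every insert takes the same lock, so concurrent transactions serialize on the log buffer.

```python
import threading


class SingleMutexLog:
    """Hypothetical in-memory log buffer guarded by one mutex (baseline only)."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._buffer = bytearray()

    def insert(self, record: bytes) -> int:
        """Append a record and return its LSN (assumed: byte offset in the log)."""
        with self._mutex:
            # LSN allocation and the copy into the buffer both happen
            # inside the critical section, so all inserts are serialized.
            lsn = len(self._buffer)
            self._buffer += record
            return lsn

    def size(self) -> int:
        with self._mutex:
            return len(self._buffer)


def run_workers(log, n_threads, inserts_per_thread, record):
    """Insert the same record from several threads; return the sorted LSNs."""
    lsns = []
    lsns_lock = threading.Lock()

    def worker():
        local = [log.insert(record) for _ in range(inserts_per_thread)]
        with lsns_lock:
            lsns.extend(local)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(lsns)
```

In this sketch the critical section grows with the record size and is entered once per record, which is the kind of serialized access to in-memory log data structures that the abstract identifies as a contention point.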