Modern hardware is abundantly parallel and increasingly heterogeneous. Its many processing cores have non-uniform access latencies to main memory and to the processor caches, which makes communication costs variable. Unfortunately, database systems mostly assume that all processing cores are the same and that microarchitectural differences are not significant enough to appear in critical database execution paths. As we demonstrate in this paper, however, hardware heterogeneity does appear in the critical path, and conventional database architectures achieve suboptimal and, even worse, unpredictable performance. We perform a detailed performance analysis of OLTP deployments on servers with multiple cores per CPU (multicore) and multiple CPUs per server (multisocket). We compare database deployment strategies that vary the number and size of independent database instances running on a single server, from a single shared-everything instance to fine-grained shared-nothing configurations. We quantify the impact of non-uniform hardware on these deployments by (a) examining how efficiently each deployment uses the available hardware resources and (b) measuring the impact of distributed transactions and skewed requests on different workloads. Finally, we argue in favor of shared-nothing deployments that are topology- and workload-aware and that take advantage of fast on-chip communication between islands of cores on the same socket.
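To make the deployment space concrete, the following minimal sketch enumerates the ways a server's cores can be split evenly into equal-sized database instances, from one shared-everything instance down to one instance per core, and reports which splits keep every instance inside a single socket. It is an illustration only, not the setup used in the analysis: the names `Instance`, `deployments`, and `island_aligned_counts` are hypothetical, cores are assumed to be assigned to instances in socket-major order, and the 4-socket, 6-core topology in the usage lines is an assumed example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One database instance and the (socket, core) pairs pinned to it."""
    cores: tuple

    @property
    def sockets(self):
        # Set of sockets this instance touches.
        return {socket for socket, _ in self.cores}


def deployments(num_sockets, cores_per_socket):
    """Yield (instance_count, instances) for every even split of the cores.

    Assumption: cores are handed out in socket-major order, so an instance
    stays on one socket only if its size divides the per-socket core count.
    """
    all_cores = [(s, c) for s in range(num_sockets)
                 for c in range(cores_per_socket)]
    total = len(all_cores)
    for count in range(1, total + 1):
        if total % count:
            continue  # skip splits that leave instances of unequal size
        size = total // count
        instances = [Instance(tuple(all_cores[i:i + size]))
                     for i in range(0, total, size)]
        yield count, instances


def island_aligned_counts(num_sockets, cores_per_socket):
    """Return the instance counts for which no instance spans two sockets."""
    return [count
            for count, instances in deployments(num_sockets, cores_per_socket)
            if all(len(inst.sockets) == 1 for inst in instances)]


if __name__ == "__main__":
    # Hypothetical topology: 4 sockets with 6 cores each (24 cores total).
    for count, instances in deployments(4, 6):
        spanning = sum(len(inst.sockets) > 1 for inst in instances)
        print(f"{count:2d} instances: {spanning} span more than one socket")
```

Under these assumptions, a split is "island-aligned" when each instance communicates only over on-chip paths; splits whose instances straddle sockets pay inter-socket latencies inside a single instance.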