OLTP in wonderland: where do cache misses come from in major OLTP components?

Authors:
Pınar Tözün;Brian Gold;Anastasia Ailamaki
Affiliations:
EPFL;Oracle Labs;EPFL
Venue:
Proceedings of the Ninth International Workshop on Data Management on New Hardware
Year:
2013

Citing 23
Cited 0

Evaluating Associativity in CPU Caches

IEEE Transactions on Computers
ARIES/IM: an efficient and high concurrency index management method using write-ahead logging

SIGMOD '92 Proceedings of the 1992 ACM SIGMOD international conference on Management of data
Memory system characterization of commercial workloads

Proceedings of the 25th annual international symposium on Computer architecture
Performance characterization of a Quad Pentium Pro SMP using OLTP workloads

Proceedings of the 25th annual international symposium on Computer architecture
Performance of database workloads on shared-memory systems with out-of-order processors

Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Code layout optimizations for transaction processing workloads

ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Fractal prefetching B+-Trees: optimizing both cache and disk performance

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Pin: building customized program analysis tools with dynamic instrumentation

Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
A performance counter architecture for computing accurate CPI components

Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Computation spreading: employing hardware migration to specialize CMP cores on-the-fly

Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Cache-conscious frequent pattern mining on modern and emerging processors

The VLDB Journal — The International Journal on Very Large Data Bases
The end of an architectural era: (it's time for a complete rewrite)

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
OLTP through the looking glass, and what we found there

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Shore-MT: a scalable storage manager for the multicore era

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Improving OLTP scalability using speculative lock inheritance

Proceedings of the VLDB Endowment
Aether: a scalable approach to logging

Proceedings of the VLDB Endowment
Data-oriented transaction execution

Proceedings of the VLDB Endowment
PLP: page latch-free shared-everything OLTP

Proceedings of the VLDB Endowment
Clearing the clouds: a study of emerging scale-out workloads on modern hardware

ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Proactive instruction fetch

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
OLTP on hardware islands

Proceedings of the VLDB Endowment
From A to E: analyzing TPC's OLTP benchmarks: the obsolete, the ubiquitous, the unexplored

Proceedings of the 16th International Conference on Extending Database Technology
SLICC: Self-Assembly of Instruction Cache Collectives for OLTP Workloads

MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture

Quantified Score

Hi-index	0.00

Visualization

Abstract

For several decades, online transaction processing has been one of the main applications that drives innovations in the data management ecosystem, and in turn the database and computer architecture communities. Despite the novel approaches from industry and various research proposals from academia, recent studies emphasize that OLTP workloads still cannot exploit the full capability of modern processors. To better integrate OLTP and hardware in future systems, we perform a detailed analysis of instruction and data misses, the main causes of memory stalls. We demonstrate which operations and components of a typical storage manager cause the majority of different types of misses in each level of the memory hierarchy on a configuration that closely represents modern commodity hardware. We also observe the impact of data working set size on these misses. According to our experimental results, L1 instruction misses are an extensive cause of the overall stall time for OLTP even for data working set sizes as large as 100GB as long as the data fits in memory. Capacity misses coming from the index probe operation are the dominant cause of the instruction and data stalls when running typical OLTP workloads. During index probe (one of the most common operations in OLTP), the B-tree, lock, and buffer management components of a storage manager are responsible for more than half of the total misses.