Cache-conscious buffering for database operators with state

Authors:
John Cieslewicz;William Mee;Kenneth A. Ross
Affiliations:
Columbia University;Columbia University;Columbia University
Venue:
Proceedings of the Fifth International Workshop on Data Management on New Hardware
Year:
2009

Citing 20
Cited 5

Quickly generating billion-record synthetic databases

SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Memory allocation strategies for complex decision support queries

Proceedings of the seventh international conference on Information and knowledge management
Making B+- trees cache conscious in main memory

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Optimizing Main-Memory Join on Modern Hardware

IEEE Transactions on Knowledge and Data Engineering
Block Oriented Processing of Relational Database Operations in Modern Computer Architectures

Proceedings of the 17th International Conference on Data Engineering
Cache Conscious Indexing for Decision-Support in Main Memory

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
DBMSs on a Modern Processor: Where Does Time Go?

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
MIL primitives for querying a fragmented world

The VLDB Journal — The International Journal on Very Large Data Bases
Buffering databse operations for enhanced instruction cache performance

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Improving database performance on simultaneous multithreading processors

VLDB '05 Proceedings of the 31st international conference on Very large data bases
C-store: a column-oriented DBMS

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Cache-conscious frequent pattern mining on a modern processor

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Improving instruction cache performance in OLTP

ACM Transactions on Database Systems (TODS)
Operating System Concepts

Operating System Concepts
Buffering accesses to memory-resident index structures

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Cache-conscious radix-decluster projections

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Sybase IQ multiplex - designed for analytics

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Adaptive aggregation on chip multiprocessors

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Optimization of frequent itemset mining on multiple-core processor

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Parallel buffers for chip multiprocessors

DaMoN '07 Proceedings of the 3rd international workshop on Data management on new hardware

Flash-optimized B+-tree

Journal of Computer Science and Technology
Design and evaluation of main memory hash join algorithms for multi-core CPUs

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Designing fast architecture-sensitive tree search on modern multicore/many-core processors

ACM Transactions on Database Systems (TODS)
Optimization of query processing with cache conscious buffering operator

DNIS'10 Proceedings of the 6th international conference on Databases in Networked Information Systems
A comparison of the use of virtual versus physical snapshots for supporting update-intensive workloads

DaMoN '12 Proceedings of the Eighth International Workshop on Data Management on New Hardware

Quantified Score

Hi-index	0.00

Visualization

Abstract

Database processes must be cache-efficient to effectively utilize modern hardware. In this paper, we analyze the importance of temporal locality and the resultant cache behavior in scheduling database operators for in-memory, block oriented query processing. We demonstrate how the overall performance of a workload of multiple database operators is strongly dependent on how they are interleaved with each other. Longer time slices combined with temporal locality within an operator amortize the effects of the initial compulsory cache misses needed to load the operator's state, such as a hash table, into the cache. Though running an operator to completion over all of its input results in the greatest amortization of cache misses, this is typically infeasible because of the large intermediate storage requirement to materialize all input tuples to an operator. We show experimentally that good cache performance can be obtained with smaller buffers whose size is determined at runtime. We demonstrate a low-overhead method of runtime cache miss sampling using hardware performance counters. Our evaluation considers two common database operators with state: aggregation and hash join. Sampling reveals operator temporal locality and cache miss behavior, and we use those characteristics to choose an appropriate input buffer/block size. The calculated buffer size balances cache miss amortization with buffer memory requirements.