Buffering databse operations for enhanced instruction cache performance

Authors:
Jingren Zhou;Kenneth A. Ross
Affiliations:
Columbia University;Columbia University
Venue:
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Year:
2004

Citing 26
Cited 26

Profile guided code positioning

PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Prefetching in supercomputer instruction caches

Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Contrasting characteristics and cache performance of technical and multi-user commercial workloads

ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Studies of Windows NT performance using dynamic execution traces

OSDI '96 Proceedings of the second USENIX symposium on Operating systems design and implementation
Efficient procedure mapping using cache line coloring

Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Memory system characterization of commercial workloads

Proceedings of the 25th annual international symposium on Computer architecture
Performance characterization of a Quad Pentium Pro SMP using OLTP workloads

Proceedings of the 25th annual international symposium on Computer architecture
Cooperative prefetching: compiler and hardware support for effective instruction prefetching in modern processors

MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Procedure placement using temporal-ordering information

ACM Transactions on Programming Languages and Systems (TOPLAS)
Making B+- trees cache conscious in main memory

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Optimizing multidimensional index trees for main memory access

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Improving index performance through prefetching

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Code layout optimizations for transaction processing workloads

ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Fractal prefetching B+-Trees: optimizing both cache and disk performance

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Access path selection in a relational database management system

SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data
The Implementation of POSTGRES

IEEE Transactions on Knowledge and Data Engineering
Starburst Mid-Flight: As the Dust Clears

IEEE Transactions on Knowledge and Data Engineering
Block Oriented Processing of Relational Database Operations in Modern Computer Architectures

Proceedings of the 17th International Conference on Data Engineering
Cache Conscious Indexing for Decision-Support in Main Memory

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Database Architecture Optimized for the New Bottleneck: Memory Access

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
DBMSs on a Modern Processor: Where Does Time Go?

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Weaving Relations for Cache Performance

Proceedings of the 27th International Conference on Very Large Data Bases
Cache Conscious Algorithms for Relational Query Processing

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Optimization of Instruction Fetch for Decision Support Workloads

ICPP '99 Proceedings of the 1999 International Conference on Parallel Processing
Call Graph Prefetching for Database Applications

HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Buffering accesses to memory-resident index structures

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29

QPipe: a simultaneously pipelined relational query engine

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Improving database performance on simultaneous multithreading processors

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Bridging the gap between OLAP and SQL

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Performance tradeoffs in read-optimized databases

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Improving instruction cache performance in OLTP

ACM Transactions on Database Systems (TODS)
Improving query I/O performance by permuting and refining block request sequences

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Optimizing indirect branch prediction accuracy in virtual machine interpreters

ACM Transactions on Programming Languages and Systems (TOPLAS)
Steps towards cache-resident transaction processing

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
To share or not to share?

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Parallel buffers for chip multiprocessors

DaMoN '07 Proceedings of the 3rd international workshop on Data management on new hardware
Vectorized data processing on the cell broadband engine

DaMoN '07 Proceedings of the 3rd international workshop on Data management on new hardware
Persistent clustered main memory index for accelerating k-NN queries on high dimensional datasets

Multimedia Tools and Applications
Column-stores vs. row-stores: how different are they really?

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Breaking the memory wall in MonetDB

Communications of the ACM - Surviving the data deluge
Temporal instruction fetch streaming

Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Cache-conscious buffering for database operators with state

Proceedings of the Fifth International Workshop on Data Management on New Hardware
Efficiently support MapReduce-like computation models inside parallel DBMS

IDEAS '09 Proceedings of the 2009 International Database Engineering & Applications Symposium
Database architecture evolution: mammals flourished long before dinosaurs became extinct

Proceedings of the VLDB Endowment
MCC-DB: minimizing cache conflicts in multi-core processors for databases

Proceedings of the VLDB Endowment
Optimizing read convoys in main-memory query processing

Proceedings of the Sixth International Workshop on Data Management on New Hardware
Column-oriented query processing for row stores

Proceedings of the ACM 14th international workshop on Data Warehousing and OLAP
Extend core UDF framework for GPU-enabled analytical query evaluation

Proceedings of the 15th Symposium on International Database Engineering & Applications
Proactive instruction fetch

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Optimization of query processing with cache conscious buffering operator

DNIS'10 Proceedings of the 6th international conference on Databases in Networked Information Systems
Micro-specialization: dynamic code specialization of database management systems

Proceedings of the Tenth International Symposium on Code Generation and Optimization
RDIP: return-address-stack directed instruction prefetching

Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture

Quantified Score

Hi-index	0.00

Visualization

Abstract

As more and more query processing work can be done in main memory access is becoming a significant cost component of database operations. Recent database research has shown that most of the memory stalls are due to second-level cache data misses and first-level instruction cache misses. While a lot of research has focused on reducing the data cache misses, relatively little research has been done on improving the instruction cache performance of database systems.We first answer the question "Why does a database system incur so many instruction cache misses?" We demonstrate that current demand-pull pipelined query execution engines suffer from significant instruction cache thrashing between different operators. We propose techniques to buffer database operations during query execution to avoid instruction cache thrashing. We implement a new light-weight "buffer" operator and study various factors which may affect the cache performance. We also introduce a plan refinement algorithm that considers the query plan and decides whether it is beneficial to add additional "buffer" operators and where to put them. The benefit is mainly from better instruction locality and better hardware branch prediction. Our techniques can be easily integrated into current database systems without significant changes. Our experiments in a memory-resident PostgreSQL database system show that buffering techniques can reduce the number of instruction cache misses by up to 80% and improve query performance by up to 15%.