Optimizing Main-Memory Join on Modern Hardware

Authors:
Stefan Manegold;Peter Boncz;Martin Kersten
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2002

Abstract

In the past decade, the exponential growth in commodity CPU speed has far outpaced advances in memory latency. A second trend is that CPU performance gains come not only from higher clock rates but also from increasing parallelism inside the CPU. Current database systems have not yet adapted to these trends and make poor use of both CPU and memory resources on current hardware. In this paper, we show how the use of these resources can be optimized for large joins, and we translate these insights into guidelines for future database architectures, encompassing data structures, algorithms, cost modeling, and implementation. In particular, we discuss how vertically fragmented data structures optimize cache performance on sequential data access. On the algorithmic side, we refine the partitioned hash-join with a new partitioning algorithm called radix-cluster, which is specifically designed to optimize memory access. The performance of this algorithm is quantified using a detailed analytical model that expresses memory access costs in terms of a limited number of parameters, such as cache sizes and miss penalties. We also present a calibration tool that extracts such parameters automatically from any computer hardware. The accuracy of our models is validated by exhaustive experiments conducted with the Monet database system on three different hardware platforms. Finally, we investigate the effect of implementation techniques that optimize CPU resource usage. Our experiments show that large joins can be accelerated by almost an order of magnitude on modern RISC hardware when both memory and CPU resources are optimized.
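
The abstract names radix-cluster as the partitioning step of a partitioned hash-join but gives no details. The following is a minimal sketch of the general idea, not the authors' implementation. It assumes integer join keys used directly as hash values, and the names `radix_cluster`, `partitioned_hash_join`, and `bits_per_pass` are hypothetical. Each pass splits every existing cluster on only a few bits, so the number of output regions written at once stays small, and the join then matches only clusters that share the same low-order bits.

```python
from collections import defaultdict


def radix_cluster(tuples, bits_per_pass):
    """Cluster (key, value) tuples on the low sum(bits_per_pass) bits of the key.

    Assumption: keys are integers and serve as their own hash value.
    Each pass refines every existing cluster on the next group of bits
    (highest group first), writing to only 2**bits buckets at a time.
    """
    shift = sum(bits_per_pass)
    clusters = [list(tuples)]
    for bits in bits_per_pass:
        shift -= bits
        mask = (1 << bits) - 1
        refined = []
        for cluster in clusters:
            buckets = [[] for _ in range(1 << bits)]
            for t in cluster:
                buckets[(t[0] >> shift) & mask].append(t)
            refined.extend(buckets)
        clusters = refined
    # clusters[i] now holds exactly the tuples whose low bits equal i.
    return clusters


def partitioned_hash_join(left, right, bits_per_pass):
    """Equi-join two lists of (key, value) tuples cluster by cluster."""
    result = []
    left_clusters = radix_cluster(left, bits_per_pass)
    right_clusters = radix_cluster(right, bits_per_pass)
    for lc, rc in zip(left_clusters, right_clusters):
        if not lc or not rc:
            continue
        # Build a small hash table on the left cluster, probe with the right.
        table = defaultdict(list)
        for key, lval in lc:
            table[key].append(lval)
        for key, rval in rc:
            for lval in table.get(key, ()):
                result.append((key, lval, rval))
    return result


left = [(1, "a"), (2, "b"), (5, "c"), (9, "d")]
right = [(5, "x"), (1, "y"), (1, "z"), (7, "q")]
joined = sorted(partitioned_hash_join(left, right, bits_per_pass=[1, 2]))
```

Here `bits_per_pass=[1, 2]` produces eight clusters in two passes. In a real system the pass count and bits per pass would be chosen from hardware parameters such as cache size, which is the role the abstract assigns to the cost model and calibration tool.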