Improving Hash Join Performance through Prefetching

Authors:
Shimin Chen;Anastassia Ailamaki;Phillip B. Gibbons;Todd C. Mowry
Affiliations:
-;-;-;-
Venue:
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Year:
2004

Citing 25
Cited 34

Join processing in database systems with large main memories

ACM Transactions on Database Systems (TODS)
An adaptive hash join algorithm for multiuser environments

Proceedings of the sixteenth international conference on Very large databases
Loop distribution with arbitrary control flow

Proceedings of the 1990 ACM/IEEE conference on Supercomputing
An effective on-chip preloading scheme to reduce data access penalty

Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Design and evaluation of a compiler algorithm for prefetching

ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Query evaluation techniques for large databases

ACM Computing Surveys (CSUR)
AlphaSort: a RISC machine sort

SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Tolerating latency through software-controlled data prefetching

Tolerating latency through software-controlled data prefetching
Compiler-based prefetching for recursive data structures

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Memory system characterization of commercial workloads

Proceedings of the 25th annual international symposium on Computer architecture
Performance characterization of a Quad Pentium Pro SMP using OLTP workloads

Proceedings of the 25th annual international symposium on Computer architecture
Automatic Compiler-Inserted Prefetching for Pointer-Based Applications

IEEE Transactions on Computers - Special issue on cache memory and related problems
Is SC + ILP = RC?

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Making B+- trees cache conscious in main memory

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Recency-based TLB preloading

Proceedings of the 27th annual international symposium on Computer architecture
Performance analysis of the Alpha 21264-based Compaq ES40 system

Proceedings of the 27th annual international symposium on Computer architecture
Main-memory index structures with fixed-size partial keys

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Improving index performance through prefetching

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Fractal prefetching B+-Trees: optimizing both cache and disk performance

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Database Architecture Optimized for the New Bottleneck: Memory Access

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
DBMSs on a Modern Processor: Where Does Time Go?

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Hash-Partitioned Join Method Using Dynamic Destaging Strategy

VLDB '88 Proceedings of the 14th International Conference on Very Large Data Bases
What Happens During a Join? Dissecting CPU and Memory Optimization Effects

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Cache Conscious Algorithms for Relational Query Processing

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
A systolic array optimizing compiler

A systolic array optimizing compiler

A Computational Database System for Generating Unstructured Hexahedral Meshes with Billions of Elements

Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Cache-Conscious Automata for XML Filtering

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Improving database performance on simultaneous multithreading processors

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Cache-conscious frequent pattern mining on a modern processor

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Inspector joins

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Multithreaded architectures and the sort benchmark

DaMoN '05 Proceedings of the 1st international workshop on Data management on new hardware
A characterization of data mining algorithms on a modern processor

DaMoN '05 Proceedings of the 1st international workshop on Data management on new hardware
Accelerating database operators using a network processor

DaMoN '05 Proceedings of the 1st international workshop on Data management on new hardware
Hardware acceleration for database systems using content addressable memories

DaMoN '05 Proceedings of the 1st international workshop on Data management on new hardware
Database hash-join algorithms on multithreaded computer architectures

Proceedings of the 3rd conference on Computing frontiers
Spatial Memory Streaming

Proceedings of the 33rd annual international symposium on Computer Architecture
Realizing parallelism in database operations: insights from a massively multithreaded architecture

DaMoN '06 Proceedings of the 2nd international workshop on Data management on new hardware
Architecture-conscious hashing

DaMoN '06 Proceedings of the 2nd international workshop on Data management on new hardware
Cache-conscious frequent pattern mining on modern and emerging processors

The VLDB Journal — The International Journal on Very Large Data Bases
Cache-oblivious nested-loop joins

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Improving hash join performance through prefetching

ACM Transactions on Database Systems (TODS)
Practical suffix tree construction

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Pipelined hash-join on multithreaded architectures

DaMoN '07 Proceedings of the 3rd international workshop on Data management on new hardware
A general framework for improving query processing performance on multi-level memory hierarchies

DaMoN '07 Proceedings of the 3rd international workshop on Data management on new hardware
Architectural characterization of XQuery workloads on modern processors

DaMoN '07 Proceedings of the 3rd international workshop on Data management on new hardware
Cache-oblivious databases: Limitations and opportunities

ACM Transactions on Database Systems (TODS)
Exploiting multithreaded architectures to improve the hash join operation

Proceedings of the 9th workshop on MEmory performance: DEaling with Applications, systems and architecture
Hash Join Optimization Based on Shared Cache Chip Multi-processor

DASFAA '09 Proceedings of the 14th International Conference on Database Systems for Advanced Applications
Sort vs. Hash revisited: fast join implementation on modern multi-core CPUs

Proceedings of the VLDB Endowment
Executing parallel TwigStack algorithm on a multi-core system

Proceedings of the 11th International Conference on Information Integration and Web-based Applications & Services
Performance improvement of join queries through algebraic signatures

International Journal of Intelligent Information and Database Systems
Optimization of joins using random record generation method

Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India
Analyzing the effects of hyperthreading on the performance of data management systems

International Journal of Parallel Programming
Scalable aggregation on multicore processors

Proceedings of the Seventh International Workshop on Data Management on New Hardware
Vectorization vs. compilation in query execution

Proceedings of the Seventh International Workshop on Data Management on New Hardware
Fast computation of database operations using content-addressable memories

DEXA'06 Proceedings of the 17th international conference on Database and Expert Systems Applications
Micro-specialization: dynamic code specialization of database management systems

Proceedings of the Tenth International Symposium on Code Generation and Optimization
SCOUT: prefetching for latent structure following queries

Proceedings of the VLDB Endowment
Vector Extensions for Decision Support DBMS Acceleration

MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture

Abstract

Hash join algorithms suffer from extensive CPU cache stalls. This paper shows that the standard hash join algorithm for disk-oriented databases (i.e., GRACE) spends over 73% of its user time stalled on CPU cache misses, and explores the use of prefetching to improve its cache performance. Applying prefetching to hash joins is complicated by the data dependencies, multiple code paths, and inherent randomness of hashing. We present two techniques, group prefetching and software-pipelined prefetching, that overcome these complications. These schemes achieve 2.0-2.9X speedups for the join phase and 1.4-2.6X speedups for the partition phase over GRACE and simple prefetching approaches. Compared with previous cache-aware approaches (i.e., cache partitioning), the schemes are at least 50% faster on large relations and do not require exclusive use of the CPU cache to be effective.
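
The abstract names group prefetching without showing it, so the following is a minimal C sketch of the general idea as applied to the probe side of a join. It is not the authors' code. The chained hash table layout, the names `entry_t`, `table_t`, `hash_key`, `table_insert`, and `probe_group`, the multiplicative hash, and the `GROUP_SIZE` value are all assumptions made for illustration. The sketch processes probe keys in small groups and runs each stage (hash, read bucket head, visit chain entry) across the whole group before moving on, so that a prefetch issued for one key can overlap with work on the others. `__builtin_prefetch` is a GCC/Clang builtin and is only a hint to the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GROUP_SIZE 8 /* hypothetical tuning knob, not a value from the paper */

/* Assumed layout: chained hash table stored in flat arrays. */
typedef struct {
    uint32_t key;
    uint32_t payload;
    int32_t  next;      /* index of the next entry in the chain, -1 at the end */
} entry_t;

typedef struct {
    int32_t  *heads;    /* bucket -> index of first entry, or -1 if empty */
    entry_t  *entries;
    uint32_t  nbuckets; /* must be a power of two */
} table_t;

/* Illustrative multiplicative hash; any hash function would do. */
static uint32_t hash_key(uint32_t k) { return k * 2654435761u; }

/* Build side: push entries[idx] onto the front of its bucket chain. */
static void table_insert(table_t *t, int32_t idx) {
    uint32_t b = hash_key(t->entries[idx].key) & (t->nbuckets - 1);
    t->entries[idx].next = t->heads[b];
    t->heads[b] = idx;
}

/* Probe side: count matching (probe key, build entry) pairs. */
static size_t probe_group(const table_t *t, const uint32_t *keys, size_t n) {
    assert(t->nbuckets != 0 && (t->nbuckets & (t->nbuckets - 1)) == 0);
    size_t matches = 0;

    for (size_t base = 0; base < n; base += GROUP_SIZE) {
        size_t g = (n - base < GROUP_SIZE) ? (n - base) : GROUP_SIZE;
        uint32_t bucket[GROUP_SIZE];
        int32_t  cur[GROUP_SIZE];

        /* Stage 1: hash every key in the group, prefetch its bucket head. */
        for (size_t i = 0; i < g; i++) {
            bucket[i] = hash_key(keys[base + i]) & (t->nbuckets - 1);
            __builtin_prefetch(&t->heads[bucket[i]], 0, 3);
        }

        /* Stage 2: read the bucket heads, prefetch the first chain entry. */
        for (size_t i = 0; i < g; i++) {
            cur[i] = t->heads[bucket[i]];
            if (cur[i] >= 0)
                __builtin_prefetch(&t->entries[cur[i]], 0, 3);
        }

        /* Stage 3: advance all chains one step per pass, prefetching the
         * next entry of each chain before any of them is dereferenced. */
        int live = 1;
        while (live) {
            live = 0;
            for (size_t i = 0; i < g; i++) {
                if (cur[i] < 0)
                    continue; /* this key's chain is exhausted */
                const entry_t *e = &t->entries[cur[i]];
                if (e->key == keys[base + i])
                    matches++;
                cur[i] = e->next;
                if (cur[i] >= 0) {
                    __builtin_prefetch(&t->entries[cur[i]], 0, 3);
                    live = 1;
                }
            }
        }
    }
    return matches;
}
```

A caller would initialize every slot of `heads` to -1, fill `entries`, call `table_insert` once per build tuple, and then pass the probe keys to `probe_group`. Chains of different lengths are one source of the multiple code paths the abstract mentions; here they are handled by letting exhausted keys simply skip later passes. A software-pipelined variant would instead overlap stages across consecutive keys rather than across a fixed group.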