Hash join algorithms suffer from extensive CPU cache stalls. This paper shows that the standard hash join algorithm for disk-oriented databases (i.e., GRACE) spends over 73% of its user time stalled on CPU cache misses, and explores the use of prefetching to improve its cache performance. Applying prefetching to hash joins is complicated by the data dependencies, multiple code paths, and inherent randomness of hashing. We present two techniques, group prefetching and software-pipelined prefetching, that overcome these complications. These schemes achieve 2.0-2.9X speedups for the join phase and 1.4-2.6X speedups for the partition phase over GRACE and simple prefetching approaches. Compared with previous cache-aware approaches (i.e., cache partitioning), the schemes are at least 50% faster on large relations and do not require exclusive use of the CPU cache to be effective.
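To make the idea of group prefetching concrete, the following is a minimal sketch of a probe loop and not the paper's implementation. It assumes a simplified open-addressing hash table with unique keys rather than the chained buckets of a GRACE join, and it relies on the GCC/Clang builtin `__builtin_prefetch`. The names `Slot`, `slot_of`, `build_insert`, `probe_simple`, `probe_group`, and the value of `GROUP_SIZE` are all hypothetical and chosen for illustration. The point is the loop structure: the work for a group of probe tuples is split into stages, so that the cache misses of the whole group can overlap instead of being taken one after another.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GROUP_SIZE 8 /* hypothetical group size; would be tuned to miss latency */

/* Hypothetical hash table slot (open addressing, linear probing). */
typedef struct {
    uint32_t key;
    uint32_t payload;
    int used;
} Slot;

/* Illustrative multiplicative hash; nslots must be a power of two. */
static size_t slot_of(uint32_t key, size_t nslots)
{
    return (size_t)(key * 2654435761u) & (nslots - 1);
}

/* Build side: insert one tuple. Assumes the table never becomes full. */
static void build_insert(Slot *table, size_t nslots, uint32_t key, uint32_t payload)
{
    assert(nslots != 0 && (nslots & (nslots - 1)) == 0);
    size_t i = slot_of(key, nslots);
    while (table[i].used)
        i = (i + 1) & (nslots - 1);
    table[i].key = key;
    table[i].payload = payload;
    table[i].used = 1;
}

/* Walk the probe sequence starting at slot i; return 1 if key is found. */
static int visit(const Slot *table, size_t nslots, size_t i, uint32_t key)
{
    while (table[i].used) {
        if (table[i].key == key)
            return 1;
        i = (i + 1) & (nslots - 1);
    }
    return 0;
}

/* Baseline: each probe hashes and then immediately touches the table,
 * so every cache miss is exposed on the critical path. */
static size_t probe_simple(const Slot *table, size_t nslots,
                           const uint32_t *keys, size_t n)
{
    size_t matches = 0;
    for (size_t k = 0; k < n; k++)
        matches += (size_t)visit(table, nslots, slot_of(keys[k], nslots), keys[k]);
    return matches;
}

/* Group prefetching sketch: stage 1 hashes a whole group of probe keys and
 * issues a prefetch for each target slot; stage 2 visits the slots, by which
 * time the prefetched cache lines may already have arrived. */
static size_t probe_group(const Slot *table, size_t nslots,
                          const uint32_t *keys, size_t n)
{
    size_t matches = 0;
    size_t idx[GROUP_SIZE];

    for (size_t base = 0; base < n; base += GROUP_SIZE) {
        size_t g = (n - base < GROUP_SIZE) ? (n - base) : GROUP_SIZE;

        for (size_t j = 0; j < g; j++) {        /* stage 1: hash + prefetch */
            idx[j] = slot_of(keys[base + j], nslots);
            __builtin_prefetch(&table[idx[j]], 0, 1);
        }
        for (size_t j = 0; j < g; j++)          /* stage 2: visit slots */
            matches += (size_t)visit(table, nslots, idx[j], keys[base + j]);
    }
    return matches;
}
```

Both probe functions return the same match count; only the order of memory accesses differs. Software-pipelined prefetching, the second technique named in the abstract, is not shown here.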