As database management systems gain importance in everyday life, efficient implementations of core database operations such as the hash join become essential. Advances in processor architecture, including simultaneous multithreading and chip multiprocessors, have opened opportunities to exploit multithreaded hardware. Recently, several efforts have been made to enhance database performance through architecture-aware data management. In this paper, we present a new architecture-aware hash join (AA_HJ) algorithm for main-memory database systems, where all the data resides in memory. AA_HJ relies on sharing critical structures at the cache level and on distributing the load evenly among threads. Our timing results show a performance improvement of up to 2.9x on the Intel® Pentium® 4 HT and up to 4.6x on the Intel® Quad Xeon® Dual-Core machine, compared to a single-threaded hash join. The L2 load miss rate is reduced by up to 82%.
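To make the two ideas named in the abstract concrete, the following is a minimal sketch of a generic in-memory hash join whose probe phase is split evenly across threads that all read one shared hash table. It is not the AA_HJ algorithm itself, whose details the abstract does not give. The function names (`build_hash_table`, `split_evenly`, `probe_chunk`, `parallel_hash_join`), the tuple row layout, and the choice to join on the first column are assumptions made for illustration. In CPython the global interpreter lock prevents real CPU parallelism for this workload, so the sketch shows the structure of the work division rather than the speedups reported above.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_hash_table(build_rows, key_index=0):
    """Build phase: map each join key to the build-side rows that carry it."""
    table = defaultdict(list)
    for row in build_rows:
        table[row[key_index]].append(row)
    return table


def split_evenly(rows, num_chunks):
    """Cut rows into num_chunks contiguous slices whose sizes differ by at most one."""
    base, extra = divmod(len(rows), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        end = start + base + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


def probe_chunk(table, probe_rows, key_index=0):
    """Probe phase for one thread: look up each probe row in the shared table."""
    matches = []
    for row in probe_rows:
        # .get() avoids inserting empty lists into the shared defaultdict.
        for build_row in table.get(row[key_index], ()):
            matches.append((build_row, row))
    return matches


def parallel_hash_join(build_rows, probe_rows, num_threads=4):
    """Join two in-memory relations on their first column (illustrative assumption)."""
    table = build_hash_table(build_rows)  # shared and read-only from here on
    chunks = split_evenly(probe_rows, num_threads)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partial = pool.map(lambda chunk: probe_chunk(table, chunk), chunks)
        return [pair for part in partial for pair in part]
```

Because the hash table is only read during the probe phase, the threads need no locking, and giving each thread an equal slice of the probe relation is the simplest form of even load distribution. Skewed key distributions would call for a finer-grained scheme than this sketch provides.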