The traditional hash join algorithm uses a single hash table built on one of the relations participating in the join operation. A variation, the double-pipelined hash join, was proposed to remedy some of the performance problems of the single-pipelined hash join. In this paper, we compare the performance of single- and double-pipelined hash joins in a cluster environment. In this environment, nodes are heterogeneous; furthermore, each node experiences dynamic local background load, unrelated to the query, that can degrade pipelined query execution performance. Previous studies have shown that the double-pipelined hash join performs substantially better than the single-pipelined hash join when dealing with data from remote sources. However, their relative performance has not been studied in cluster environments. Our study indicates that, in the type of cluster environment we consider here, the single-pipelined hash join performs as well as or better than the double-pipelined hash join in most cases. We present experimental results on a Pentium cluster and identify these cases.
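To make the distinction between the two join variants concrete, the following is a minimal single-process sketch in Python, not the implementation evaluated in the paper. It assumes the usual formulation in which the double-pipelined (symmetric) variant keeps one hash table per input and probes the opposite table as each tuple arrives. The function names, the key-extractor parameters, and the round-robin alternation between inputs (a stand-in for tuples arriving from different nodes) are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import zip_longest

_DONE = object()  # sentinel marking an exhausted input (hypothetical helper)


def single_hash_join(build, probe, key_build, key_probe):
    """Single hash table: consume all of `build` first, then stream `probe`."""
    table = defaultdict(list)
    for b in build:
        table[key_build(b)].append(b)
    # No result can be produced until the build phase above has finished.
    for p in probe:
        for b in table.get(key_probe(p), []):
            yield (b, p)


def double_pipelined_hash_join(left, right, key_left, key_right):
    """Two hash tables: each arriving tuple is inserted into its own table
    and immediately probed against the other input's table."""
    table_left, table_right = defaultdict(list), defaultdict(list)
    # Assumption: inputs are interleaved round-robin; a real system would
    # process tuples in whatever order they arrive.
    for l, r in zip_longest(left, right, fillvalue=_DONE):
        if l is not _DONE:
            k = key_left(l)
            table_left[k].append(l)
            for match in table_right.get(k, []):
                yield (l, match)
        if r is not _DONE:
            k = key_right(r)
            table_right[k].append(r)
            for match in table_left.get(k, []):
                yield (match, r)


if __name__ == "__main__":
    left = [(1, "a"), (2, "b"), (3, "c")]
    right = [(2, "x"), (3, "y"), (3, "z"), (4, "w")]
    first = lambda t: t[0]
    print(sorted(single_hash_join(left, right, first, first)))
    print(sorted(double_pipelined_hash_join(left, right, first, first)))
```

Both functions return the same set of matching pairs; they differ in when results can be emitted and in how much state they hold. The sketch says nothing about the relative performance on a cluster, which is the question the paper addresses experimentally.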