Dynamic Load Balancing in Multicomputer Database Systems Using Partition Tuning
IEEE Transactions on Knowledge and Data Engineering
Frequency-adaptive join for shared nothing machines
Progress in computer research
Estimation of Query-Result Distribution and its Application in Parallel-Join Load Balancing
VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
DEXA '99 Proceedings of the 10th International Conference on Database and Expert Systems Applications
There has been a wealth of research on parallel join algorithms. Among them, hash-based algorithms are particularly suitable for shared-nothing database systems. The effectiveness of these techniques depends on a uniform distribution of the join attribute values. When this condition is not met, bucket sizes can vary severely, leaving the processing nodes with uneven workloads. Many parallel join algorithms with load-balancing capability have been proposed to address this problem. Of these, the sampling and incremental approaches have been shown to improve on the more conventional methods. The two approaches, however, have not been compared against each other. In this paper, we improve both techniques and implement them on an nCUBE/2 parallel computer to compare their performance. Our study indicates that the sampling technique is the better approach.
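The skew problem described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the hash function (key modulo node count), the helper names `partition_sizes` and `imbalance`, and the toy key sets are all assumptions chosen for illustration. It only shows how a single frequent join value inflates one bucket under plain hash partitioning.

```python
def partition_sizes(join_keys, num_nodes):
    """Count the tuples each node receives under plain hash partitioning."""
    sizes = [0] * num_nodes
    for key in join_keys:
        # Assumed hash function for illustration: key modulo node count.
        sizes[key % num_nodes] += 1
    return sizes


def imbalance(sizes):
    """Largest bucket divided by the mean bucket size (1.0 means balanced)."""
    mean = sum(sizes) / len(sizes)
    return max(sizes) / mean


# Hypothetical data: 1000 tuples, joined on an integer attribute.
uniform_keys = list(range(1000))              # every value occurs once
skewed_keys = [7] * 700 + list(range(300))    # one value dominates

uniform_sizes = partition_sizes(uniform_keys, 4)
skewed_sizes = partition_sizes(skewed_keys, 4)

print(uniform_sizes, imbalance(uniform_sizes))
print(skewed_sizes, imbalance(skewed_sizes))
```

With uniform keys every node receives the same share, while in the skewed case the node that owns the frequent value receives most of the tuples. Load-balancing joins such as the sampling and incremental approaches aim to detect and correct this kind of imbalance.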