Design and evaluation of parallel pipelined join algorithms

  • Authors:
  • James P. Richardson; Hongjun Lu; Krishna Mikkilineni

  • Affiliations:
  • Honeywell, Inc., Golden Valley, MN; Honeywell, Inc., Golden Valley, MN; Honeywell, Inc., Golden Valley, MN

  • Venue:
  • SIGMOD '87 Proceedings of the 1987 ACM SIGMOD international conference on Management of data
  • Year:
  • 1987

Abstract

The join operation is the most costly operation in relational database management systems. Distributed and parallel processing can effectively speed up the join operation. In this paper, we describe a number of highly parallel and pipelined multiprocessor join algorithms using sort-merge and hashing techniques. Among them, two algorithms are parallel and pipelined versions of traditional sort-merge join methods, two algorithms use both hashing and sort-merge techniques, and another two are variations of the hybrid hash join algorithms. The performance of those algorithms is evaluated analytically against a generic database machine architecture. The methodology used in the design and evaluation of these algorithms is also discussed. The results of the analysis indicate that using a hashing technique to partition the source relations can dramatically reduce the elapsed time. Hash-based algorithms outperform sort-merge algorithms in almost all cases because of their high parallelism. Hash-based sort-merge and hybrid hash methods provide similar performance in most cases. With large source relations, the algorithms which replicate the smaller relation usually give better elapsed time. Sharing memory among processors also improves performance somewhat.
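
The idea of partitioning the source relations by hashing, mentioned in the abstract, can be illustrated with a minimal Python sketch. This is not one of the paper's algorithms: the function names (`partition`, `hash_join_bucket`, `partitioned_hash_join`), the tuple-based row format, and the sequential loop standing in for separate processors are all assumptions made for illustration. It shows only why partitioning helps: rows with equal join keys land in the same bucket, so each bucket pair can be joined independently.

```python
from collections import defaultdict


def partition(relation, key_index, num_partitions):
    """Split rows into buckets by hashing the join attribute (hypothetical helper)."""
    buckets = [[] for _ in range(num_partitions)]
    for row in relation:
        buckets[hash(row[key_index]) % num_partitions].append(row)
    return buckets


def hash_join_bucket(r_bucket, s_bucket, r_key, s_key):
    """Join one pair of buckets: build a hash table on R, then probe it with S."""
    table = defaultdict(list)
    for r in r_bucket:
        table[r[r_key]].append(r)
    return [r + s for s in s_bucket for r in table.get(s[s_key], [])]


def partitioned_hash_join(R, S, r_key=0, s_key=0, num_partitions=4):
    """Equi-join R and S after hash-partitioning both on the join attribute.

    Matching rows always share a bucket number, so each bucket pair is
    independent work. In a multiprocessor setting each pair could go to a
    different processor; here the loop is sequential (an assumption of
    this sketch, chosen for simplicity).
    """
    r_parts = partition(R, r_key, num_partitions)
    s_parts = partition(S, s_key, num_partitions)
    result = []
    for r_bucket, s_bucket in zip(r_parts, s_parts):
        result.extend(hash_join_bucket(r_bucket, s_bucket, r_key, s_key))
    return result


if __name__ == "__main__":
    R = [(1, "a"), (2, "b"), (3, "c")]
    S = [(2, "x"), (3, "y"), (3, "z"), (4, "w")]
    for row in sorted(partitioned_hash_join(R, S)):
        print(row)
```

Because no bucket pair depends on any other, the per-bucket joins are the natural unit of parallel work in a scheme of this kind; the sketch says nothing about the pipelining or cost model analyzed in the paper.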