Sorting Large Files on a Backend Multiprocessor

  • Authors:
  • Micah Beck; Dina Bitton; W. K. Wilkinson

  • Venue:
  • Sorting Large Files on a Backend Multiprocessor
  • Year:
  • 1986

Abstract

A fundamental measure of processing power in a database management system is the performance of the sort utility it provides. When sorting a large data file on a serial computer, performance is limited by factors involving processor speed, memory capacity and I/O bandwidth. In this paper, we investigate the feasibility and efficiency of a parallel sort-merge algorithm through implementation on the JASMIN prototype, a backend multiprocessor built around a fast packet bus. We describe the design and implementation of a parallel sort utility that may become a building block for query processing in a database system that runs on JASMIN. We present and analyze the results of measurements corresponding to a range of file sizes and processor configurations. Our results show that, using current off-the-shelf technology coupled with a streamlined distributed operating system, three- and five-microprocessor configurations provide a very cost-effective sort of large files. The three-processor configuration sorts a 100-megabyte file in one hour, which compares well with commercial sort packages available on high-performance mainframes. In additional experiments, we investigate a model to tune our sort software, and scale our results to higher processor and network capabilities.
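
For readers unfamiliar with the general sort-merge idea the abstract refers to, the following is a minimal sketch and not the JASMIN implementation. It assumes the whole input fits in memory and uses a thread pool as a stand-in for separate processors. The names `split_into_runs` and `parallel_sort_merge`, and the `run_size` and `workers` parameters, are hypothetical and do not come from the paper.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def split_into_runs(records, run_size):
    """Cut the input into consecutive chunks of at most run_size records."""
    return [records[i:i + run_size] for i in range(0, len(records), run_size)]


def parallel_sort_merge(records, run_size=4, workers=3):
    """Sort runs concurrently, then k-way merge the sorted runs.

    Illustrative only: worker threads stand in for the processors of a
    backend multiprocessor (an assumption, not the paper's design).
    """
    if run_size < 1:
        raise ValueError("run_size must be at least 1")
    runs = split_into_runs(list(records), run_size)
    # Sort phase: each worker sorts whole runs independently of the others.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_runs = list(pool.map(sorted, runs))
    # Merge phase: heapq.merge repeatedly yields the smallest head element
    # among all sorted runs, producing one fully sorted sequence.
    return list(heapq.merge(*sorted_runs))
```

Calling `parallel_sort_merge([5, 3, 9, 1, 7, 2, 8])` sorts two runs separately and merges them into one ordered list. A real large-file sort would write the runs to disk and merge them in a streaming fashion rather than holding everything in memory.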