High-performance sorting on networks of workstations

  • Authors:
  • Andrea C. Arpaci-Dusseau;Remzi H. Arpaci-Dusseau;David E. Culler;Joseph M. Hellerstein;David A. Patterson

  • Affiliations:
  • Computer Science Division, University of California, Berkeley;-;Computer Science Division, University of California, Berkeley;Computer Science Division, University of California, Berkeley;Computer Science Division, University of California, Berkeley

  • Venue:
  • SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
  • Year:
  • 1997

Abstract

We report the performance of NOW-Sort, a collection of sorting implementations on a Network of Workstations (NOW). We find that parallel sorting on a NOW is competitive with sorting on the large-scale SMPs that have traditionally held the performance records. On a 64-node cluster, we sort 6.0 GB in just under one minute, while a 32-node cluster finishes the Datamation benchmark in 2.41 seconds. Our implementations can be applied to a variety of disk, memory, and processor configurations; we highlight salient issues for tuning each component of the system. We evaluate the use of commodity operating systems and hardware for parallel sorting. We find existing OS primitives for memory management and file access adequate. Due to aggregate communication and disk bandwidth requirements, the bottleneck of our system is the workstation I/O bus.
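
To put the two headline results on a common scale, the sketch below converts them into sorted-data throughput. It is a back-of-the-envelope calculation, not a figure from the paper. It assumes that 1 GB means 10^9 bytes, that the Datamation benchmark input is one million 100-byte records (the standard definition of that benchmark, not stated in the abstract), and that "just under one minute" can be bounded by 60 seconds, which makes the first result a lower bound. The function and variable names are illustrative only.

```python
def throughput_mb_per_s(total_bytes: float, seconds: float) -> float:
    """Return sorted-data throughput in MB/s (1 MB = 10**6 bytes)."""
    return total_bytes / seconds / 1e6


# 64-node result: 6.0 GB in just under one minute.
# Assumption: 1 GB = 10**9 bytes; 60 s is used as an upper bound on the time.
minute_bytes = 6.0e9
minute_nodes = 64
minute_seconds = 60.0

# 32-node result: Datamation benchmark in 2.41 seconds.
# Assumption: the benchmark input is 1,000,000 records of 100 bytes each.
datamation_bytes = 1_000_000 * 100
datamation_nodes = 32
datamation_seconds = 2.41

minute_total = throughput_mb_per_s(minute_bytes, minute_seconds)
minute_per_node = minute_total / minute_nodes

datamation_total = throughput_mb_per_s(datamation_bytes, datamation_seconds)
datamation_per_node = datamation_total / datamation_nodes

print(f"64 nodes: {minute_total:.1f} MB/s total, {minute_per_node:.2f} MB/s per node")
print(f"32 nodes: {datamation_total:.1f} MB/s total, {datamation_per_node:.2f} MB/s per node")
```

These numbers count each input byte once. They measure end-to-end sorted data per second, not raw disk or network traffic, since a sort must at least read and write every byte and move most of it between nodes.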