Reconciling simplicity and realism in parallel disk models

  • Authors: Peter Sanders

  • Affiliations: Max Planck Institut für Informatik Saarbrücken, Germany

  • Venue: SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
  • Year: 2001

Abstract

Designing and analyzing algorithms that process huge data sets requires a machine model that handles parallel disks. There appears to be a dilemma between keeping such a model simple and flexible to use and modelling the details of the hardware accurately. This paper explains how many aspects of this problem can be resolved. The programming model implements one large logical disk that allows concurrent access to arbitrary sets of variable-size blocks. This model can be implemented efficiently on multiple independent disks, even if zones with different speeds, communication bottlenecks, and failed disks are allowed. These results not only provide useful algorithmic tools but also imply a theoretical justification for studying external memory algorithms using simple abstract models.

The algorithmic approach is random redundant placement of data and optimal scheduling of accesses. The analysis generalizes a previous analysis for simple abstract external memory models in several ways (higher efficiency, variable block sizes, a more detailed disk model). As a side effect, an apparently new Chernoff bound for sums of weighted 0-1 random variables is derived.
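The idea of random redundant placement combined with optimal scheduling can be illustrated with a small sketch. It is not the paper's algorithm. It assumes unit-size blocks, identical disks, and exactly two copies per block, and it uses a plain augmenting-path matching as a stand-in for the scheduling step. The names `place_blocks` and `schedule` are hypothetical and chosen only for this illustration.

```python
import random


def place_blocks(num_blocks, num_disks, copies=2, seed=0):
    """Store each block on `copies` distinct disks chosen uniformly at random.

    Assumption for illustration: unit-size blocks and identical disks.
    """
    rng = random.Random(seed)
    return [rng.sample(range(num_disks), copies) for _ in range(num_blocks)]


def schedule(requests, placement, num_disks):
    """Pick one copy per requested block so the busiest disk serves as few
    blocks as possible.

    `requests` is a list of distinct block ids. Returns (rounds, assignment),
    where `rounds` is the maximum per-disk load and `assignment` maps each
    requested block to the disk that serves it.
    """
    if not requests:
        return 0, {}

    def try_load(cap):
        on_disk = [[] for _ in range(num_disks)]  # disk -> blocks it serves
        served_by = {}                            # block -> serving disk

        def augment(block, visited):
            # Try to serve `block`, relocating other blocks if necessary.
            for d in placement[block]:
                if d in visited:
                    continue
                visited.add(d)
                if len(on_disk[d]) < cap:
                    on_disk[d].append(block)
                    served_by[block] = d
                    return True
                for i, other in enumerate(on_disk[d]):
                    if augment(other, visited):
                        on_disk[d][i] = block  # `other` moved elsewhere
                        served_by[block] = d
                        return True
            return False

        for b in requests:
            if not augment(b, set()):
                return None
        return served_by

    # The smallest feasible capacity is the optimal number of parallel rounds.
    for cap in range(1, len(requests) + 1):
        assignment = try_load(cap)
        if assignment is not None:
            return cap, assignment
    raise ValueError("a requested block has no copy on any disk")


if __name__ == "__main__":
    placement = place_blocks(num_blocks=64, num_disks=8)
    rounds, assignment = schedule(list(range(32)), placement, num_disks=8)
    print("parallel I/O rounds needed:", rounds)
```

Without redundancy, two requested blocks that happen to share a disk always cost two rounds. With a second copy, the scheduler can often move one of them to a less loaded disk, which is the effect that redundant placement is meant to exploit.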