External Memory Algorithms

  • Authors:
  • Jeffrey Scott Vitter

  • Affiliations:
  • -

  • Venue:
  • ESA '98 Proceedings of the 6th Annual European Symposium on Algorithms
  • Year:
  • 1998

Quantified Score

Hi-index 0.00

Visualization

Abstract

Data sets in large applications are often too massive to fit completely inside the computer's internal memory. The resulting input/ output communication (or I/O) between fast internal memory and slower external memory (such as disks) can be a major performance bottleneck. In this tutorial, we survey the state of the art in the design and analysis of external memory algorithms (also known as EM algorithms or out-of-core algorithms or I/O algorithms). External memory algorithms are often designed using the parallel disk model (PDM). The three machine-independent measures of an algorithm's performance in PDM are the number of I/O operations performed, the CPU time, and the amount of disk space used. PDM allows for multiple disks (or disk arrays) and parallel CPUs, and it can be generalized to handle cache hierarchies, hierarchical memory, and tertiary storage. We discuss a variety of problems and show how to solve them efficiently in external memory. Programming tools and environments are available for simplifying the programming task. Experiments on some newly developed algorithms for spatial databases incorporating these paradigms, implemented using TPIE (Transparent Parallel I/O programming Environment), show significant speedups over popular methods.