A data management approach for handling large compressed arrays in high performance computing

  • Authors:
  • K. E. Seamons; M. Winslett

  • Venue:
  • FRONTIERS '95 Proceedings of the Fifth Symposium on the Frontiers of Massively Parallel Computation (Frontiers'95)
  • Year:
  • 1995

Abstract

Poor parallel I/O performance has recently been recognized as a roadblock to the scalability of parallel architectures, algorithms, and data sets. For I/O of large arrays, storing arrays by subarray divisions (chunking) has been shown to improve I/O performance substantially in many circumstances. In this paper we show how to increase the performance advantages of chunking by combining it with data compression, and describe the results of experiments with compressed chunks from scientific data sets on the Intel iPSC/860. For a particular fixed array size and compression ratio, uncompressed chunk I/O is faster than compressed chunk I/O when the number of processors is small; the reverse holds when the number of processors is large, as the cost of compression is spread over a larger number of processors. With good compression ratios and large numbers of processors, adding compression to an existing chunked I/O library gave us an effective logical I/O rate for compressed chunks that exceeds the theoretically possible maximum for uncompressed data. Our results suggest that compression may be a good technique for handling sparse arrays in parallel I/O.
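
The processor-count tradeoff described in the abstract can be illustrated with a toy cost model. This is only a sketch under assumptions that do not come from the paper: aggregate disk bandwidth is fixed and shared, compression work divides evenly across processors, and compression does not overlap with the transfer. All function names, parameters, and numbers are hypothetical.

```python
# Toy cost model (an assumption for illustration, not the paper's model).
# Sizes are in bytes, bandwidths in bytes per second.

def uncompressed_time(size_bytes, disk_bw):
    """Seconds to move the whole array through a fixed aggregate disk bandwidth."""
    return size_bytes / disk_bw


def compressed_time(size_bytes, disk_bw, ratio, compress_bw, procs):
    """Seconds to compress in parallel, then move the smaller compressed array.

    ratio       -- uncompressed size divided by compressed size (assumed > 1)
    compress_bw -- per-processor compression throughput (hypothetical)
    procs       -- number of processors sharing the compression work
    """
    compress = size_bytes / (procs * compress_bw)   # cost spread over processors
    transfer = (size_bytes / ratio) / disk_bw       # fewer bytes reach the disks
    return compress + transfer


def logical_rate(size_bytes, seconds):
    """Effective rate measured in uncompressed (logical) bytes per second."""
    return size_bytes / seconds


def crossover_procs(disk_bw, ratio, compress_bw):
    """Processor count above which compressed I/O wins in this model (ratio > 1)."""
    return disk_bw / (compress_bw * (1.0 - 1.0 / ratio))
```

In this model the logical rate for compressed chunks approaches `ratio * disk_bw` as `procs` grows, which is why it can exceed the uncompressed ceiling of `disk_bw`. With few processors the compression term dominates and uncompressed I/O is faster.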