A generic high-performance method for deinterleaving scientific data

  • Authors:
  • Eric R. Schendel; Steve Harenberg; Houjun Tang; Venkatram Vishwanath; Michael E. Papka; Nagiza F. Samatova

  • Affiliations:
  • North Carolina State University, Raleigh, NC and Argonne National Laboratory, Argonne, IL and Oak Ridge National Laboratory, Oak Ridge, TN; North Carolina State University, Raleigh, NC and Oak Ridge National Laboratory, Oak Ridge, TN; North Carolina State University, Raleigh, NC and Oak Ridge National Laboratory, Oak Ridge, TN; Argonne National Laboratory, Argonne, IL; Northern Illinois University, DeKalb, IL and Argonne National Laboratory, Argonne, IL; North Carolina State University, Raleigh, NC and Oak Ridge National Laboratory, Oak Ridge, TN

  • Venue:
  • Euro-Par'13 Proceedings of the 19th International Conference on Parallel Processing
  • Year:
  • 2013

Abstract

High-performance and energy-efficient data management applications are a necessity for HPC systems because of the extreme scale of data produced by the high-fidelity scientific simulations these systems support. Data layout in memory strongly affects performance. To speed up their calculation phase, most simulations interleave variables in memory, but the data must then be deinterleaved for subsequent storage and analysis. Efficient deinterleaving is therefore critical, yet common deinterleaving methods deliver poor throughput and energy efficiency. To address this problem, we propose a deinterleaving method that is high performance, energy efficient, and generic to any data type. To the best of our knowledge, this is the first deinterleaving method that 1) exploits data cache prefetching, 2) reduces memory accesses, and 3) optimizes the use of complete cache line writes. When evaluated against conventional deinterleaving methods on 105 STREAM standard micro-benchmarks, our method always improved throughput and throughput/watt on multi-core systems. In the best case, it improved throughput by up to 26.2x and throughput/watt by up to 7.8x.
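
To make the term concrete, the following is a minimal sketch of a conventional, element-by-element deinterleave, the kind of baseline the abstract compares against. It is not the method proposed in the paper. The function name `deinterleave_naive` and its parameters are hypothetical, chosen only for illustration; the sketch assumes the source holds `n_records` records, each containing `n_vars` values of `elem_size` bytes, and that the destination should hold all values of the first variable, then all values of the second, and so on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Baseline (conventional) deinterleave; NOT the paper's method.
 * Hypothetical helper for illustration only.
 *
 * src layout: v0 v1 ... | v0 v1 ... | ...   (n_records records)
 * dst layout: all v0 values, then all v1 values, and so on.
 * Works for any element type because it copies elem_size raw bytes. */
void deinterleave_naive(void *dst, const void *src,
                        size_t n_records, size_t n_vars, size_t elem_size)
{
    assert(dst != NULL && src != NULL && elem_size > 0);

    unsigned char *out = dst;
    const unsigned char *in = src;

    for (size_t r = 0; r < n_records; r++) {
        for (size_t v = 0; v < n_vars; v++) {
            /* Element v of record r moves to slot r of variable v. */
            memcpy(out + (v * n_records + r) * elem_size,
                   in + (r * n_vars + v) * elem_size,
                   elem_size);
        }
    }
}
```

In this sketch the reads are sequential, but consecutive writes land in `n_vars` different output regions, so each output cache line is filled a few bytes at a time. That access pattern is one plausible reason a simple loop like this underperforms, and it is the kind of behavior the three optimizations listed in the abstract (prefetching, fewer memory accesses, complete cache line writes) appear to target.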