Byte-precision level of detail processing for variable precision analytics

  • Authors:
  • John Jenkins; Eric R. Schendel; Sriram Lakshminarasimhan; David A. Boyuka, II; Terry Rogers; Stephane Ethier; Robert Ross; Scott Klasky; Nagiza F. Samatova

  • Affiliations:
  • North Carolina State University, NC and Oak Ridge National Laboratory, TN; North Carolina State University, NC and Oak Ridge National Laboratory, TN; North Carolina State University, NC and Oak Ridge National Laboratory, TN; North Carolina State University, NC and Oak Ridge National Laboratory, TN; North Carolina State University, NC and Oak Ridge National Laboratory, TN; Princeton Plasma Physics Laboratory, NJ; Argonne National Laboratory, IL; Oak Ridge National Laboratory, TN; North Carolina State University, NC and Oak Ridge National Laboratory, TN

  • Venue:
  • SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
  • Year:
  • 2012

Abstract

I/O bottlenecks in HPC applications are becoming a more pressing problem as compute capabilities continue to outpace I/O capabilities. While double-precision simulation data often must be stored losslessly, losing part of the fractional component may introduce acceptably small errors in many types of scientific analyses. Given this observation, we develop a precision level of detail (APLOD) library, which partitions double-precision datasets along user-defined byte boundaries. APLOD parameterizes the tradeoff between analysis accuracy and I/O performance, bounds the maximum relative error, preserves the I/O access patterns of full-precision data, and operates with low overhead. Using ADIOS as an I/O use case, we show a reduction in disk access time proportional to the degree of precision retained. Finally, we show the effects of partial-precision analysis on accuracy for operations such as k-means and Fourier analysis, finding that varying degrees of precision can be widely applied to reduce the cost of analyzing extreme-scale data.
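
The idea of partitioning doubles along byte boundaries can be illustrated with a minimal Python sketch. This is not the APLOD library or its API: the function names (split_bytes, reconstruct, relative_error_bound), the big-endian byte order, and the zero-filling of missing low-order bytes are all assumptions made for illustration. Under those assumptions, keeping the k most significant bytes of an IEEE 754 double retains the sign, the 11 exponent bits, and 8k - 12 mantissa bits, so truncation of a normal (non-subnormal) value gives a relative error of at most 2^-(8k - 12).

```python
import struct

def split_bytes(values, k):
    """Split each double into its k most significant bytes and the
    remaining 8 - k bytes (hypothetical helper, not the APLOD API)."""
    if not 2 <= k <= 8:
        # k = 1 would cut through the exponent, so no relative bound holds.
        raise ValueError("k must be between 2 and 8")
    high, low = bytearray(), bytearray()
    for v in values:
        raw = struct.pack(">d", v)  # big-endian: sign/exponent bytes first
        high += raw[:k]
        low += raw[k:]
    return bytes(high), bytes(low)

def reconstruct(high, k, low=None):
    """Rebuild doubles from the high-order partition. If the low-order
    partition is not supplied, the missing bytes are zero-filled."""
    rest = 8 - k
    out = []
    for i in range(len(high) // k):
        head = high[i * k:(i + 1) * k]
        tail = bytes(rest) if low is None else low[i * rest:(i + 1) * rest]
        out.append(struct.unpack(">d", head + tail)[0])
    return out

def relative_error_bound(k):
    """Bound on relative truncation error for normal doubles when only
    the k most significant bytes are kept (8k - 12 mantissa bits)."""
    return 2.0 ** -(8 * k - 12)

if __name__ == "__main__":
    data = [3.141592653589793, -1234.5678, 6.02e23, 1e-10]
    for k in (2, 3, 4):
        high, _ = split_bytes(data, k)
        approx = reconstruct(high, k)
        worst = max(abs(a - v) / abs(v) for a, v in zip(approx, data))
        print(k, worst, relative_error_bound(k))
```

Storing the high-order and low-order partitions separately means an analysis that tolerates reduced precision only needs to read k of every 8 bytes, while supplying both partitions restores the original values exactly.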