ISOBAR hybrid compression-I/O interleaving for large-scale parallel I/O optimization

Authors:
Eric R. Schendel;Saurabh V. Pendse;John Jenkins;David A. Boyuka, II;Zhenhuan Gong;Sriram Lakshminarasimhan;Qing Liu;Hemanth Kolla;Jackie Chen;Scott Klasky;Robert Ross;Nagiza F. Samatova
Affiliations:
North Carolina State University & Oak Ridge National Laboratory, Raleigh, NC, USA;North Carolina State University & Oak Ridge National Laboratory, Raleigh, NC, USA;North Carolina State University & Oak Ridge National Laboratory, Raleigh, NC, USA;North Carolina State University & Oak Ridge National Laboratory, Raleigh, NC, USA;North Carolina State University & Oak Ridge National Laboratory, Raleigh, NC, USA;North Carolina State University & Oak Ridge National Laboratory, Raleigh, NC, USA;Oak Ridge National Laboratory, Oak Ridge, TN, USA;Sandia National Laboratory, Livermore, CA, USA;Sandia National Laboratory, Livermore, CA, USA;Oak Ridge National Laboratory, Oak Ridge, TN, USA;Argonne National Laboratory, Argonne, IL, USA;North Carolina State University & Oak Ridge National Laboratory, Raleigh, NC, USA
Venue:
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
Year:
2012



Abstract

Current peta-scale data analytics frameworks suffer from a significant performance bottleneck due to an imbalance between their enormous computational power and limited I/O bandwidth. Using data compression schemes to reduce the amount of I/O activity is a promising approach to addressing this problem. In this paper, we propose a hybrid framework for interleaving I/O with data compression to achieve improved I/O throughput together with reduced dataset size. We evaluate several interleaving strategies, present theoretical models, and assess the efficiency and scalability of our approach through comparative analysis. With our theoretical model, considering 19 real-world scientific datasets both from the public domain and peta-scale simulations, we estimate that the hybrid method can result in a 12% to 46% increase in throughput on hard-to-compress scientific datasets. At the reported peak bandwidth of 60 GB/s of uncompressed data for a current, leadership-class parallel I/O system, this translates into an effective gain of 7 to 28 GB/s in aggregate throughput.
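The throughput figures in the abstract can be checked with simple arithmetic. The sketch below assumes the estimated increase is a percentage of the 60 GB/s peak bandwidth; the function and constant names are illustrative and are not taken from the paper.

```python
# Reported peak bandwidth for uncompressed data, as stated in the abstract.
PEAK_BANDWIDTH_GBPS = 60.0


def effective_gain(peak_gbps: float, relative_increase: float) -> float:
    """Absolute throughput gain (GB/s) for a fractional increase over the peak."""
    return peak_gbps * relative_increase


# Lower and upper estimates, assuming the increase is 12% and 46% of the peak.
low = effective_gain(PEAK_BANDWIDTH_GBPS, 0.12)
high = effective_gain(PEAK_BANDWIDTH_GBPS, 0.46)

print(f"{low:.1f} to {high:.1f} GB/s")  # prints "7.2 to 27.6 GB/s"
```

Rounded to whole numbers, these values match the stated gain of 7 to 28 GB/s.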