ZOID: I/O-forwarding infrastructure for petascale architectures

Authors:
Kamil Iskra;John W. Romein;Kazutomo Yoshii;Pete Beckman
Affiliations:
Argonne National Laboratory, Argonne, IL, USA;Stichting ASTRON (Netherlands Foundation for Research in Astronomy), Dwingeloo, Netherlands;Argonne National Laboratory, Argonne, IL, USA;Argonne National Laboratory, Argonne, IL, USA
Venue:
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Year:
2008

Abstract

The ZeptoOS project is developing an open-source alternative to the proprietary software stacks available on contemporary massively parallel architectures. The aim is to enable computer science research on these architectures, enhance community collaboration, and foster innovation. In this paper, we introduce a component of ZeptoOS called ZOID, an I/O-forwarding infrastructure for architectures such as IBM Blue Gene that decouple file and socket I/O from the compute nodes, shipping those functions to dedicated I/O nodes. Through the use of optimized network protocols and data paths, as well as a multithreaded daemon running on I/O nodes, ZOID provides greater performance than the stock infrastructure. We present a set of benchmark results that highlight the improvements. Crucially, our infrastructure is far more flexible than the stock infrastructure: users can forward data using custom-designed application interfaces through an easy-to-use plug-in mechanism. This capability is used for real-time telescope data transfers, discussed extensively in the paper. Plug-in-specific threads prefetch data obtained over sockets from an input cluster and merge results from individual compute nodes before sending them out, significantly reducing the required network bandwidth. This approach allows a ZOID version of the application to handle a larger number of subbands per I/O node, or even to bypass the input cluster altogether, plugging the input from remote receiver stations directly into the I/O nodes. Using the resources more efficiently can result in considerable savings.
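
The bandwidth argument for merging on the I/O node can be made concrete with a small sketch. This is not ZOID's plug-in API: the function names `merge_partial_results` and `outgoing_values` are hypothetical, and element-wise summation is assumed as the merge operation purely for illustration, since the abstract does not say how results are combined.

```python
from typing import List, Sequence


def merge_partial_results(partials: Sequence[Sequence[float]]) -> List[float]:
    """Combine equally sized per-compute-node buffers into one buffer.

    Element-wise summation is an assumption made for illustration only;
    the abstract does not specify the actual merge operation.
    """
    if not partials:
        return []
    length = len(partials[0])
    if any(len(p) != length for p in partials):
        raise ValueError("all partial results must have the same length")
    # zip(*partials) yields one tuple per buffer position, across all nodes.
    return [sum(column) for column in zip(*partials)]


def outgoing_values(num_nodes: int, values_per_node: int, merged: bool) -> int:
    """Count the values leaving the I/O node with and without merging."""
    return values_per_node if merged else num_nodes * values_per_node
```

Under these assumptions, an I/O node serving N compute nodes sends one buffer instead of N, so the outgoing volume shrinks by a factor of N. The real reduction depends on how the application's results actually combine.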