Portals 3.0: Protocol Building Blocks for Low Overhead Communication
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Improving Collective I/O Performance Using Threads
IPPS '99/SPDP '99 Proceedings of the 13th International Symposium on Parallel Processing and the 10th Symposium on Parallel and Distributed Processing
Parallel netCDF: A High-Performance Scientific I/O Interface
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Flexible IO and integration for scientific codes through the adaptable IO system (ADIOS)
CLADE '08 Proceedings of the 6th international workshop on Challenges of large applications in distributed environments
Scaling parallel I/O performance through I/O delegate and caching system
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
ParColl: Partitioned Collective I/O on the Cray XT
ICPP '08 Proceedings of the 2008 37th International Conference on Parallel Processing
LIVE data workspace: A flexible, dynamic and extensible platform for petascale applications
CLUSTER '07 Proceedings of the 2007 IEEE International Conference on Cluster Computing
Adaptable, metadata rich IO methods for portable high performance IO
IPDPS '09 Proceedings of the 2009 IEEE International Symposium on Parallel&Distributed Processing
DataSpaces: an interaction and coordination framework for coupled simulation workflows
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Six degrees of scientific data: reading patterns for extreme scale science IO
Proceedings of the 20th international symposium on High performance distributed computing
An application-level parallel I/O library for Earth system models
International Journal of High Performance Computing Applications
Insights for exascale IO APIs from building a petascale IO API
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Scientific Programming - A New Overview of the Trilinos Project --Part 1
Hi-index | 0.00 |
The increasing fidelity of scientific simulations as they scale towards exascale sizes is straining the proven IO techniques championed throughout terascale computing. Chief among the successful IO techniques is the idea of collective IO where processes coordinate and exchange data prior to writing to storage in an effort to reduce the number of small, independent IO operations. As well as collective IO works for efficiently creating a data set in the canonical order, 3-D domain decompositions prove troublesome due to the amount of data exchanged prior to writing to storage. When each process has a tiny piece of a 3-D simulation space rather than a complete 'pencil' or 'plane', 2-D or 1-D domain decompositions respectively, the communication overhead to rearrange the data can dwarf the time spent actually writing to storage [27]. Our approach seeks to transparently increase scalability and performance while maintaining both the IO routines in the application and the final data format in the storage system. Accomplishing this leverages both the Nessie [23] RPC framework and a staging area with staging services. Through these tools, we employ a variety of data processing operations prior to invoking the native API to write data to storage yielding as much as a 3X performance improvement over the native calls.