Because IO performance on HPC machines depends strongly on machine characteristics and configuration, it is important to tune IO libraries carefully and to make good use of the appropriate library APIs. For instance, on current petascale machines, independent IO tends to outperform collective IO, in part due to bottlenecks at the metadata server. The problem is exacerbated by scaling issues: each IO library scales differently on each machine and typically remains efficient up to different scales on different machines. With scientific codes being run on a variety of HPC resources, efficient code execution requires us to address three important issues: (1) end users should be able to select the most efficient IO methods for their codes, with minimal effort in terms of code updates or alterations; (2) such performance-driven choices should not prevent data from being stored in the desired file formats, since those are crucial for later data analysis; and (3) there should be efficient ways of identifying and selecting particular data for analysis, to help end users cope with the flood of data produced by high-end codes. This paper employs ADIOS, the ADaptable IO System, as an IO API to address (1) through (3) above. Concerning (1), ADIOS makes it possible to select independently the IO method used by each grouping of data in an application, so that end users can use the IO methods that perform best given both the IO patterns and the underlying hardware. In this paper, we also use this facility of ADIOS to evaluate experimentally, on petascale machines, alternative methods for high performance IO. Specific examples studied include methods that use strong file consistency versus delayed parallel data consistency, such as that provided by MPI-IO or POSIX IO.
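The per-group method selection described for (1) can be pictured with a minimal sketch. It is not the ADIOS API: the function names, the method labels "POSIX" and "AGGREGATE", and the group names "restart" and "diagnostics" are all hypothetical, and ordinary local files stand in for parallel IO. The point it illustrates is only that the mapping from data group to IO method lives in configuration, so changing the method requires no change to the code that produces the data.

```python
import os
import tempfile


def write_posix(path, rank, payload):
    """Hypothetical independent method: each rank writes its own file."""
    fname = f"{path}.{rank}"
    with open(fname, "wb") as f:
        f.write(payload)
    return fname


def write_aggregated(path, rank, payload):
    """Hypothetical shared-file method: every rank appends to one file."""
    with open(path, "ab") as f:
        f.write(payload)
    return path


# Registry of available methods (illustrative labels, not ADIOS method names).
METHODS = {"POSIX": write_posix, "AGGREGATE": write_aggregated}

# External configuration: data group -> IO method (assumed group names).
GROUP_METHOD = {"restart": "POSIX", "diagnostics": "AGGREGATE"}


def write_group(group, path, rank, payload, config=GROUP_METHOD):
    """Write one group's data with whatever method the configuration names."""
    method = METHODS[config[group]]
    return method(path, rank, payload)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        out = write_group("restart", os.path.join(d, "restart.bp"), 0, b"state")
        print(os.path.basename(out), os.path.getsize(out))
```

Passing a different `config` dictionary switches a group to another method while the calling code stays the same, which is the property the abstract attributes to ADIOS.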
Concerning (2), to avoid tying IO methods to specific file formats while still attaining high IO performance, ADIOS introduces an efficient intermediate file format, termed BP, which can be converted, at small cost, to the standard file formats used by analysis tools, such as NetCDF and HDF5. Concerning (3), BP has associated with it efficient methods for data characterization, which compute attributes that can be used to identify data sets without having to inspect or analyze the entire contents of large files.
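The idea behind data characterization in (3) can be sketched as follows. The sketch assumes that the computed attributes are per-block minimum and maximum values; the abstract does not say which attributes are computed, and the `BlockStats` record, its fields, and both functions are hypothetical rather than part of the BP format. Attributes are computed once when a block is written, and later selection consults only those attributes.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    """Hypothetical per-block characteristics kept in a small index."""
    name: str
    offset: int
    count: int
    minimum: float
    maximum: float


def characterize(name, offset, values):
    """Compute the characteristics of one block at write time."""
    return BlockStats(name, offset, len(values), min(values), max(values))


def blocks_in_range(index, lo, hi):
    """Select blocks whose [minimum, maximum] overlaps [lo, hi].

    Only the index is consulted; the block data itself is never read.
    """
    return [b for b in index if b.maximum >= lo and b.minimum <= hi]


if __name__ == "__main__":
    index = [
        characterize("temperature", 0, [1.0, 2.5, 3.0]),
        characterize("temperature", 3, [10.0, 12.0]),
        characterize("temperature", 5, [4.0, 9.5, 11.0]),
    ]
    for block in blocks_in_range(index, 9.0, 10.5):
        print(block.offset, block.minimum, block.maximum)
```

A range overlap test can return blocks that contain no matching value, so it narrows the set of blocks worth reading without replacing the final check on the data.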