Increasingly severe I/O bottlenecks on High-End Computing machines are prompting scientists to process output data "in situ", that is, during simulation time and before placing data on disk. This paper argues for flexibility in the implementation of such in situ data analytics, using measurements and a performance model that demonstrate the potential advantages and limitations of performing analytics at different levels of the I/O hierarchy, including on a machine's compute nodes vs. on separate "staging" nodes dedicated to analysis tasks. Model and measurement results are guided by realistic large-scale applications running on leadership-class machines. I/O and analytics actions are described as computational dataflow graphs, termed I/O graphs, that combine data movement with "in transit" operations on data as it is being moved across the I/O hierarchy. Results demonstrate the importance of flexibility in analytics placement and characterize the attributes of analytics operations that lead to different placement decisions.