LIVE data workspace: A flexible, dynamic and extensible platform for petascale applications

Authors:
Hasan Abbasi; Matthew Wolf; Karsten Schwan
Affiliations:
College of Computing, Georgia Institute of Technology, Atlanta, USA; College of Computing, Georgia Institute of Technology, Atlanta, USA; College of Computing, Georgia Institute of Technology, Atlanta, USA
Venue:
CLUSTER '07 Proceedings of the 2007 IEEE International Conference on Cluster Computing
Year:
2007

Citing 0
Cited 10

VPM tokens: virtual machine-aware power budgeting in datacenters
HPDC '08 Proceedings of the 17th international symposium on High performance distributed computing

DART: a substrate for high speed asynchronous data IO
HPDC '08 Proceedings of the 17th international symposium on High performance distributed computing

Flexible IO and integration for scientific codes through the adaptable IO system (ADIOS)
CLADE '08 Proceedings of the 6th international workshop on Challenges of large applications in distributed environments

DataStager: scalable data staging services for petascale applications
Proceedings of the 18th ACM international symposium on High performance distributed computing

VPM tokens: virtual machine-aware power budgeting in datacenters
Cluster Computing

Modeling resource-coupled computations
Proceedings of the 2009 Workshop on Ultrascale Visualization

DataStager: scalable data staging services for petascale applications
Cluster Computing

Managing Variability in the IO Performance of Petascale Storage Systems
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis

Just in time: adding value to the IO pipelines of high performance applications with JITStaging
Proceedings of the 20th international symposium on High performance distributed computing

Extending scalability of collective IO through nessie and staging
Proceedings of the sixth workshop on Parallel Data Storage

Abstract

The data needs of current and future petascale applications have grown over the last half decade to the point that appropriate data management has become a crucial requirement. This concerns not only the storage of data produced by this new class of applications, but also the data exchanges needed to couple applications with concurrent analysis, online visualization for validation, and similar tasks. To address such dynamic code coupling, we introduce the concept of an extensible, dynamic, and flexible data workspace, termed LIVE. In contrast to data exchanges programmed with MPI, MPI-IO, or grid software, LIVE focuses on data exchanges carried out without a priori knowledge of potential data requirements. Examples include exchanges required by ad hoc or dynamically determined methods for data validation, general data analysis, or data visualization. Running on an execution environment composed of integrated dynamic discovery and online management services, LIVE is used to create a ‘data workspace’ for a working molecular dynamics code base used by mechanical and materials engineers at Georgia Tech for multi-scale materials modeling. Measurements of both this application’s data workspace and the basic primitives of the LIVE framework demonstrate that the environment’s substantial flexibility has minimal impact on overall performance and, in fact, improves performance in a number of usage scenarios. In particular, for a visualization pipeline example derived from our collaborators, we see a slight improvement over a solution based on MPI-IO, and a further improvement of up to 5% from LIVE’s ability to overlap communication with user-specified computation.