Modern High-Performance Computing applications consume and produce exponentially increasing amounts of data. This growth has led to significant resources being dedicated to staging data in and out of supercomputing centers. The typical approach to staging is a direct transfer of application data between the center and the application submission site. Such direct transfers become problematic, especially for staging out, because (i) the transfer time grows with the size of the data and may exceed the time allowed by the center's purge policies; and (ii) the submission site may not be online to receive the data, further increasing the chances that the output data will be purged. In this paper, we argue for a systematic staging-out approach that uses intermediary data-holding nodes to quickly offload data from the center to the intermediaries, thereby avoiding the peril of a purge and addressing the two issues above. The intermediary nodes provide temporary storage for the staged-out data and maximize offload bandwidth by providing multiple dataflow paths from the center to the submission site. Our initial investigation shows this technique to be effective in addressing both issues and in providing better QoS guarantees for data retrieval.
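The offload pattern described above can be made concrete with a small sketch. The following Python fragment is a minimal illustration under stated assumptions, not the paper's implementation: the intermediary hosts in INTERMEDIARIES, the send_chunk transfer routine, and the offload driver are all hypothetical placeholders, and simple round-robin striping stands in for whatever dataflow scheduling the actual system uses; a real deployment would ship chunks over a transfer protocol such as GridFTP or HTTP rather than simulating the send.

import concurrent.futures
import hashlib

# Hypothetical intermediary data-holding nodes (placeholders, not real hosts).
INTERMEDIARIES = ["node-a.example.org", "node-b.example.org", "node-c.example.org"]

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MiB per chunk

def send_chunk(node: str, index: int, chunk: bytes) -> str:
    """Ship one chunk to an intermediary node. A real system would perform
    a network transfer here; this sketch only simulates the send and
    returns a receipt (the chunk's digest) as the acknowledgment."""
    digest = hashlib.sha256(chunk).hexdigest()
    print(f"chunk {index} ({len(chunk)} bytes) -> {node}")
    return digest

def offload(path: str) -> dict[int, tuple[str, str]]:
    """Stripe a staged-out file across the intermediaries in parallel, so
    the center can release its copy as soon as every chunk is acknowledged,
    regardless of whether the submission site is currently online."""
    receipts: dict[int, tuple[str, str]] = {}
    with open(path, "rb") as f, concurrent.futures.ThreadPoolExecutor(
        max_workers=len(INTERMEDIARIES)
    ) as pool:
        futures = {}
        index = 0
        # Round-robin chunks over the intermediaries to open multiple
        # concurrent dataflow paths out of the center.
        while chunk := f.read(CHUNK_SIZE):
            node = INTERMEDIARIES[index % len(INTERMEDIARIES)]
            futures[pool.submit(send_chunk, node, index, chunk)] = node
            index += 1
        for fut, node in futures.items():
            i = list(futures).index(fut)
            receipts[i] = (node, fut.result())  # block until acknowledged
    return receipts

Striping across all intermediaries concurrently is what lets the aggregate offload bandwidth exceed that of any single center-to-submission-site path; once every chunk is acknowledged, the center-side copy can be purged safely, and the submission site can later retrieve the chunks from the intermediaries at its own pace.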