Modern High-Performance Computing applications consume and produce exponentially increasing amounts of data. This growth has led to significant resources being dedicated to staging data in and out of supercomputing centers. The typical approach to staging is a direct transfer of application data between the center and the application submission site. Such direct transfers become problematic, especially for staging out, because (i) the transfer time grows with the size of the data and may exceed the time allowed by the center's purge policies; and (ii) the submission site may not be online to receive the data, further increasing the chances that the output data will be purged. In this paper, we argue for a systematic staging-out approach that uses intermediary data-holding nodes to quickly offload data from the center to the intermediaries, thereby avoiding the peril of a purge and addressing the two issues above. The intermediary nodes provide temporary storage for the staged-out data and maximize offload bandwidth by providing multiple dataflow paths from the center to the submission site. Our initial investigation shows this technique to be effective in addressing both issues and in providing better QoS guarantees for data retrieval.
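The offload pattern described above can be made concrete with a small sketch. The following Python fragment is a minimal illustration under stated assumptions, not the paper's implementation: the intermediary hosts in INTERMEDIARIES, the send_chunk transfer routine, and the offload driver are all hypothetical placeholders, and simple round-robin striping stands in for whatever dataflow scheduling the actual system uses; a real deployment would ship chunks over a transfer protocol such as GridFTP or HTTP rather than simulating the send.

import concurrent.futures
import hashlib

# Hypothetical intermediary data-holding nodes (placeholders, not real hosts).
INTERMEDIARIES = ["node-a.example.org", "node-b.example.org", "node-c.example.org"]

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MiB per chunk

def send_chunk(node: str, index: int, chunk: bytes) -> str:
    """Ship one chunk to an intermediary node. A real system would perform
    a network transfer here; this sketch only simulates the send and
    returns a receipt (the chunk's digest) as the acknowledgment."""
    digest = hashlib.sha256(chunk).hexdigest()
    print(f"chunk {index} ({len(chunk)} bytes) -> {node}")
    return digest

def offload(path: str) -> dict[int, tuple[str, str]]:
    """Stripe a staged-out file across the intermediaries in parallel, so
    the center can release its copy as soon as every chunk is acknowledged,
    regardless of whether the submission site is currently online."""
    receipts: dict[int, tuple[str, str]] = {}
    with open(path, "rb") as f, concurrent.futures.ThreadPoolExecutor(
        max_workers=len(INTERMEDIARIES)
    ) as pool:
        futures = {}
        index = 0
        # Round-robin chunks over the intermediaries to open multiple
        # concurrent dataflow paths out of the center.
        while chunk := f.read(CHUNK_SIZE):
            node = INTERMEDIARIES[index % len(INTERMEDIARIES)]
            futures[pool.submit(send_chunk, node, index, chunk)] = node
            index += 1
        for fut, node in futures.items():
            i = list(futures).index(fut)
            receipts[i] = (node, fut.result())  # block until acknowledged
    return receipts

Striping across all intermediaries concurrently is what lets the aggregate offload bandwidth exceed that of any single center-to-submission-site path; once every chunk is acknowledged, the center-side copy can be purged safely, and the submission site can later retrieve the chunks from the intermediaries at its own pace.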