A framework for reliable and efficient data placement in distributed computing systems

Authors:
Tevfik Kosar;Miron Livny
Affiliations:
Computer Sciences Department, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, WI 53706, USA;Computer Sciences Department, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, WI 53706, USA
Venue:
Journal of Parallel and Distributed Computing - Special issue: Design and performance of networks for super-, cluster-, and grid-computing: Part I
Year:
2005

Citing 21
Cited 13

Remote I/O: fast access to distant storage

Proceedings of the fifth workshop on I/O in parallel and distributed systems
Automatic TCP buffer tuning

Proceedings of the ACM SIGCOMM '98 conference on Applications, technologies, architectures, and protocols for computer communication
Adaptive performance prediction for distributed data-intensive applications

SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Application-level scheduling on distributed heterogeneous networks

Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
OceanStore: an architecture for global-scale persistent storage

ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Operating Systems

Operating Systems
Predicting the Performance of Wide Area Data Transfers

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Chimera: A Virtual Data System for Representing, Querying, and Automating Data Derivation

SSDBM '02 Proceedings of the 14th International Conference on Scientific and Statistical Database Management
The SDSC storage resource broker

CASCON '98 Proceedings of the 1998 conference of the Centre for Advanced Studies on Collaborative research
Evaluation of the inter-cluster data transfer on Grid environment

CCGRID '03 Proceedings of the 3rd International Symposium on Cluster Computing and the Grid
Forecasting network performance to support dynamic scheduling using the network weather service

HPDC '97 Proceedings of the 6th IEEE International Symposium on High Performance Distributed Computing
Distant I/O: One-Sided Access to Secondary Storage on Remote Processors

HPDC '98 Proceedings of the 7th IEEE International Symposium on High Performance Distributed Computing
Matchmaking: Distributed Resource Management for High Throughput Computing

HPDC '98 Proceedings of the 7th IEEE International Symposium on High Performance Distributed Computing
The Ethernet Approach to Grid Computing

HPDC '03 Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing
Secure, Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing

MSS '01 Proceedings of the Eighteenth IEEE Symposium on Mass Storage Systems and Technologies
Building the Mass Storage System at Jefferson Lab (Ian Bird, Bryan Hess, Andy Kowalski)

MSS '01 Proceedings of the Eighteenth IEEE Symposium on Mass Storage Systems and Technologies
Dynamic Server Selection using Bandwidth Probing in Wide-Area Networks

Dynamic Server Selection using Bandwidth Probing in Wide-Area Networks
Condor-G: A Computation Management Agent for Multi-Institutional Grids

HPDC '01 Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing
The Kangaroo Approach to Data Movement on the Grid

HPDC '01 Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing
Performance and Scalability of a Replica Location Service

HPDC '04 Proceedings of the 13th IEEE International Symposium on High Performance Distributed Computing
Explicit control in a batch-aware distributed file system

NSDI'04 Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1

The Globus Striped GridFTP Framework and Server

SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Improving GridFTP transfers by means of a multiagent parallel file system

Multiagent and Grid Systems - Grid Computing, high performance and distributed applications
Data placement for scientific applications in distributed environments

GRID '07 Proceedings of the 8th IEEE/ACM International Conference on Grid Computing
Data Staging Strategies and Their Impact on the Execution of Scientific Workflows

Proceedings of the second international workshop on Data-aware distributed computing
Scheduling data-intensive workflows on storage constrained resources

Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science
A data placement strategy in scientific cloud workflows

Future Generation Computer Systems
Data transfer planning with tree placement for collaborative environments

Constraints
Software as a service for data scientists

Communications of the ACM
Graph-Cut Based Coscheduling Strategy Towards Efficient Execution of Scientific Workflows in Collaborative Cloud Environments

GRID '11 Proceedings of the 2011 IEEE/ACM 12th International Conference on Grid Computing
High performance reliable file transfers using automatic many-to-many parallelization

Euro-Par'12 Proceedings of the 18th international conference on Parallel processing workshops
A Bee Colony based optimization approach for simultaneous job scheduling and data replication in grid environments

Computers and Operations Research
Hopfield neural network for simultaneous job scheduling and data replication in grids

Future Generation Computer Systems
MapReduce framework energy adaptation via temperature awareness

Cluster Computing

Quantified Score

Hi-index	0.03

Abstract

Data placement is an essential part of today's distributed applications, since moving data close to the application has many benefits. The increasing data requirements of both scientific and commercial applications, and collaborative access to these data, make it even more important. In the current approach, data placement is regarded as a side effect of computation. Our goal is to make data placement a first-class citizen in distributed computing systems, just like computational jobs. Data placement jobs will be queued, scheduled, monitored, managed, and even checkpointed. Because data placement jobs have different characteristics from computational jobs, they cannot be treated in exactly the same way. For this purpose, we propose a framework that can be considered a "data placement subsystem" for distributed computing systems, similar to the I/O subsystem in operating systems. This framework includes a specialized scheduler for data placement, a high-level planner aware of data placement jobs, a resource broker/policy enforcer, and some optimization tools. Our system can perform reliable and efficient data placement, recover from all kinds of failures without human intervention, and adapt dynamically to the environment at execution time.