Data placement is an essential part of today's distributed applications, since moving data close to the application has many benefits. The increasing data requirements of both scientific and commercial applications, and collaborative access to these data, make it even more important. In the current approach, data placement is regarded as a side effect of computation. Our goal is to make data placement a first-class citizen in distributed computing systems, just like computational jobs. Data placement jobs will be queued, scheduled, monitored, managed, and even checkpointed. Because data placement jobs have different characteristics than computational jobs, they cannot be treated in exactly the same way. For this purpose, we propose a framework that can be considered a "data placement subsystem" for distributed computing systems, similar to the I/O subsystem in operating systems. This framework includes a specialized scheduler for data placement, a high-level planner aware of data placement jobs, a resource broker/policy enforcer, and some optimization tools. Our system can perform reliable and efficient data placement, can recover from all kinds of failures without any human intervention, and can dynamically adapt to the environment at execution time.
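To illustrate what treating data placement as a first-class job might look like, here is a minimal Python sketch of a queue that schedules transfer jobs, tracks their status, and retries failed transfers without human intervention. It is not the framework described above: the names `PlacementJob` and `PlacementScheduler`, the `transfer` callable, and the fixed retry limit are all hypothetical, and real failure recovery, checkpointing, and policy enforcement would be considerably more involved.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class PlacementJob:
    """Hypothetical data placement job: move one item from src to dest."""
    src: str
    dest: str
    max_retries: int = 3      # assumed policy: give up after this many attempts
    attempts: int = 0
    status: str = "queued"    # queued -> running -> done | failed


class PlacementScheduler:
    """Hypothetical FIFO scheduler dedicated to data placement jobs."""

    def __init__(self, transfer: Callable[[str, str], None]) -> None:
        # `transfer` is an assumed callable that raises OSError on failure.
        self.transfer = transfer
        self.queue: Deque[PlacementJob] = deque()
        self.finished: List[PlacementJob] = []

    def submit(self, job: PlacementJob) -> None:
        self.queue.append(job)

    def run(self) -> None:
        """Process queued jobs, re-queueing failed ones until the limit."""
        while self.queue:
            job = self.queue.popleft()
            job.status = "running"
            job.attempts += 1
            try:
                self.transfer(job.src, job.dest)
                job.status = "done"
            except OSError:
                if job.attempts < job.max_retries:
                    job.status = "queued"   # retry later, behind other jobs
                    self.queue.append(job)
                    continue
                job.status = "failed"
            self.finished.append(job)
```

A caller would pass in whatever transfer routine is available (for example, a wrapper around a file copy), submit jobs, and inspect `finished` afterwards to monitor outcomes.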