Data staging is an important data management problem in distributed heterogeneous networking environments, where each data storage location and intermediate node may have its own available data, storage limitations, and communication links. Sites in the network request data items, and each item is associated with a specific deadline and priority. It is assumed that not all requests can be satisfied by their deadlines. This work concentrates on a basic version of the data staging problem in which all parameter values for the communication system and the data request information represent the best information collected so far and stay fixed throughout the scheduling process. A mathematical model for the basic data staging problem is introduced. Then, a heuristic based on a multiple-source shortest-path algorithm is presented for finding a suboptimal schedule of the communication steps for data staging. A simulation study evaluates the performance of the proposed heuristic, and the results show its advantages over two random-based scheduling techniques. This research, based on the simplified static model, serves as a necessary step toward solving the more realistic and complicated version of the data staging problem, which involves dynamic scheduling, fault tolerance, and determining where to stage data.
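
To make the multiple-source shortest-path idea concrete, the following is a minimal sketch and not the heuristic from the paper. It assumes a static network given as a dictionary of links with fixed transfer times, a set of source nodes that already hold a data item, and requests written as (node, deadline, priority) tuples. The names `earliest_arrival` and `satisfiable_requests` are hypothetical, and the sketch ignores storage limitations and link contention, which the full problem has to account for.

```python
import heapq


def earliest_arrival(links, sources):
    """Multi-source Dijkstra: earliest time a data item can reach each node.

    links   -- dict mapping node -> list of (neighbor, transfer_time) pairs
    sources -- dict mapping node -> time at which the item is available there
    (Both structures are assumptions made for this illustration.)
    """
    arrival = dict(sources)
    # Seed the priority queue with every source at once.
    heap = [(t, node) for node, t in sources.items()]
    heapq.heapify(heap)
    while heap:
        t, node = heapq.heappop(heap)
        if t > arrival.get(node, float("inf")):
            continue  # stale queue entry, a better time was already found
        for neighbor, cost in links.get(node, []):
            candidate = t + cost
            if candidate < arrival.get(neighbor, float("inf")):
                arrival[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return arrival


def satisfiable_requests(links, sources, requests):
    """Return the (node, deadline, priority) requests that can meet their
    deadline when considered in isolation, highest priority first."""
    arrival = earliest_arrival(links, sources)
    met = [r for r in requests if arrival.get(r[0], float("inf")) <= r[1]]
    return sorted(met, key=lambda r: -r[2])


if __name__ == "__main__":
    links = {"A": [("C", 4)], "B": [("C", 1), ("D", 5)], "C": [("D", 2)]}
    sources = {"A": 0, "B": 1}
    requests = [("C", 3, 1), ("D", 3, 5), ("D", 10, 2)]
    print(earliest_arrival(links, sources))
    print(satisfiable_requests(links, sources, requests))
```

Seeding the queue with all sources at once gives, in a single pass, the earliest time at which any copy of the item could reach each node. A scheduler could use these times to decide which requests are worth attempting before committing communication steps, but how the paper's heuristic uses them is not specified in the abstract.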