Access-pattern and bandwidth aware file replication algorithm in a grid environment

Authors:
H. Sato; S. Matsuoka; T. Endo; N. Maruyama
Affiliations:
Tokyo Inst. of Technol., Tokyo; Tokyo Inst. of Technol., Tokyo; Tokyo Inst. of Technol., Tokyo; Tokyo Inst. of Technol., Tokyo
Venue:
GRID '08 Proceedings of the 2008 9th IEEE/ACM International Conference on Grid Computing
Year:
2008

Abstract

Replication in grid file systems can significantly improve the I/O performance of data-intensive grid applications, but creating and placing replicas manually is impractical in a real grid environment, where a single application may access thousands to millions of files. The number and placement of replicas should be determined automatically from application access patterns and network throughput, so as to achieve high application throughput while minimizing the space overhead of replicas; previous studies, however, have optimized over only limited parameter spaces. We propose an automated replication algorithm that allows most I/O accesses to complete within a given time threshold while minimizing the space overhead of replication. Our algorithm models replication as a combinatorial optimization problem in which the constraints are derived from the given access-time threshold and various system parameters, and the objective function minimizes file replication cost. We solve the optimization problem using inter-node link throughputs and file access patterns of running applications, which are monitored and estimated dynamically. Our simulation-based studies suggest that the proposed algorithm can achieve higher performance than simple techniques, such as those that always or never create replicas, while keeping storage usage very low. The results also indicate that the proposed automated algorithm can perform comparably to manual replica placement.
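
To make the shape of such an optimization concrete, the following is a minimal sketch and not the authors' formulation. It assumes a single file, a uniform storage cost per replica (so minimizing cost means minimizing the number of replicas), a fixed table of link throughputs rather than dynamically monitored estimates, and a stricter constraint than the abstract's "most accesses": every reader must meet the threshold. All names (`min_replica_set`, `throughput`, the node labels) are hypothetical, and the search is exhaustive, so it is only suitable for tiny examples.

```python
from itertools import combinations


def min_replica_set(size_mb, holder, readers, nodes, throughput, threshold_s):
    """Return the smallest set of extra replica nodes such that every reader
    can fetch the file from some copy within threshold_s seconds.

    throughput maps (source, destination) pairs to MB/s (assumed known).
    Returns None if no placement satisfies the constraint.
    """

    def satisfies(copies):
        for reader in readers:
            # A local copy costs no transfer time; otherwise size / throughput.
            best = min(
                0.0 if src == reader else size_mb / throughput[(src, reader)]
                for src in copies
            )
            if best > threshold_s:
                return False
        return True

    candidates = [n for n in nodes if n != holder]
    # Try placements in order of increasing replica count, so the first
    # feasible placement found has minimal storage overhead.
    for k in range(len(candidates) + 1):
        for extra in combinations(candidates, k):
            if satisfies((holder,) + extra):
                return set(extra)
    return None


# Hypothetical three-node grid: a-b and b-c are fast links, a-c is slow.
nodes = ["a", "b", "c"]
throughput = {
    ("a", "b"): 100.0, ("b", "a"): 100.0,
    ("a", "c"): 10.0, ("c", "a"): 10.0,
    ("b", "c"): 100.0, ("c", "b"): 100.0,
}

# A 1000 MB file held on "a", read from "b" and "c", with a 20 s threshold.
placement = min_replica_set(1000.0, "a", ["b", "c"], nodes, throughput, 20.0)
print(placement)  # {'b'}
```

In this example a single replica on node b is enough: b then reads locally, and c can fetch from b in 10 s instead of 100 s from a. Loosening the threshold to 200 s makes the empty set feasible, which corresponds to the "never replicate" baseline mentioned in the abstract.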