Optimizing Data Management in Grid Environments

Authors:
Antonis Zissimos; Katerina Doka; Antony Chazapis; Dimitrios Tsoumakos; Nectarios Koziris
Affiliations:
School of Electrical and Computer Engineering, Computing Systems Laboratory, National Technical University of Athens (all authors)
Venue:
OTM '09 Proceedings of the Confederated International Conferences, CoopIS, DOA, IS, and ODBASE 2009 on On the Move to Meaningful Internet Systems: Part I
Year:
2009

Abstract

Grids currently serve as platforms for numerous scientific as well as business applications that generate and access vast amounts of data. In this paper, we address the need for efficient, scalable and robust data management in Grid environments. We propose a fully decentralized and adaptive mechanism comprising two components: a Distributed Replica Location Service (DRLS) and a data transfer mechanism called GridTorrent. Both adopt Peer-to-Peer techniques in order to overcome performance bottlenecks and single points of failure. On the one hand, DRLS ensures resilience by relying on a Byzantine-tolerant protocol and is able to handle massive numbers of concurrent requests even during node churn. On the other hand, GridTorrent allows for maximum bandwidth utilization through collaborative sharing among the various data providers and consumers. The proposed integrated architecture is completely backwards-compatible with already deployed Grids. To demonstrate these points, experiments have been conducted in LAN as well as WAN environments under various workloads. The evaluation shows that our scheme vastly outperforms the conventional mechanisms in both efficiency (up to 10 times faster) and robustness in case of failures and flash crowd instances.