Implementation of the fault tolerance in computational grid using agents by meta-modelling approach

Authors:
C. Srimathi;J. Vaideeswaran
Affiliations:
School of Computing Science and Engineering, VIT University, Vellore, 632014, Tamil Nadu, India;School of Computing Science and Engineering, VIT University, Vellore, 632014, Tamil Nadu, India
Venue:
International Journal of Communication Networks and Distributed Systems
Year:
2013

Abstract

In a grid environment, some nodes may be working while others are inactive. Alternatively, all the computers may be operational while the network interconnecting them fails. From the perspective of one computer, such a network partition may appear as a failure of the other computers. Failures of either kind can have a major impact on an application that has been executing on the grid for many days. In this paper, we meta-model the computational grid and implement a fault-tolerance mechanism using Java agents. The purpose of the proposed work is to automate the development of a computational grid and the creation of graphical application workflows using domain-specific modelling techniques. The paper aims to provide a high-level view of the construction of grid applications, with flexibility in design and deployment.
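
The observation that a network partition can look like a node failure can be illustrated with a minimal sketch of a timeout-based heartbeat detector. This is not the paper's agent design, which the abstract does not describe; the class name HeartbeatMonitor, its methods, the node names and the 1000 ms timeout are all hypothetical and chosen only for illustration. Time is passed in explicitly so the behaviour is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical timeout-based failure detector (illustrative only, not the paper's agents).
class HeartbeatMonitor {
    private final long timeoutMillis;
    // Time (in ms) at which the last heartbeat from each node was received.
    private final Map<String, Long> lastSeen = new HashMap<>();

    HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called when a heartbeat from nodeId arrives at time nowMillis.
    void recordHeartbeat(String nodeId, long nowMillis) {
        lastSeen.put(nodeId, nowMillis);
    }

    // A node is suspected if it was never heard from, or has been silent
    // for longer than the timeout. The monitor cannot tell whether the
    // node crashed or the network between them is partitioned.
    boolean isSuspected(String nodeId, long nowMillis) {
        Long last = lastSeen.get(nodeId);
        return last == null || nowMillis - last > timeoutMillis;
    }

    public static void main(String[] args) {
        HeartbeatMonitor monitor = new HeartbeatMonitor(1000);
        monitor.recordHeartbeat("nodeA", 0);
        monitor.recordHeartbeat("nodeB", 0);
        // nodeA keeps reporting; nodeB is still running but is cut off
        // by a partition, so its heartbeats never reach the monitor.
        monitor.recordHeartbeat("nodeA", 900);
        monitor.recordHeartbeat("nodeA", 1800);
        System.out.println("nodeA suspected: " + monitor.isSuspected("nodeA", 2000));
        System.out.println("nodeB suspected: " + monitor.isSuspected("nodeB", 2000));
    }
}
```

In this sketch nodeB is reported as suspected even though it never stopped running. A silent node and an unreachable node are indistinguishable to the observer, which is why a grid application that runs for days needs a recovery mechanism rather than detection alone.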