This article describes the implementation of checkpointing and recovery services in a Java-based distributed platform. Our case study is SUMA, a distributed execution platform implemented on top of Grid services. SUMA is designed to execute Java bytecode, with additional support for parallel processing. Its middleware is built on commodity software and communication technologies, including Java, CORBA, and Globus services. The implementation of SUMA that runs on top of Globus services is called SUMA/G.