The Hector Distributed Run-Time Environment
IEEE Transactions on Parallel and Distributed Systems
The implementation of Dynamite: an environment for migrating PVM tasks
ACM SIGOPS Operating Systems Review
Experiments with Migration of Message-Passing Tasks
GRID '00 Proceedings of the First IEEE/ACM International Workshop on Grid Computing
Fault-Tolerant Parallel Applications Using Queues and Actions
ICPP '97 Proceedings of the International Conference on Parallel Processing
The Impact of Migration on Parallel Job Scheduling for Distributed Systems
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
MPICH-CM: A Communication Library Design for a P2P MPI Implementation
Proceedings of the 9th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
An Integrated Approach to Parallel Scheduling Using Gang-Scheduling, Backfilling, and Migration
IEEE Transactions on Parallel and Distributed Systems
The design and implementation of Zap: a system for migrating computing environments
ACM SIGOPS Operating Systems Review - OSDI '02: Proceedings of the 5th symposium on Operating systems design and implementation
A Problem-Specific Fault-Tolerance Mechanism for Asynchronous, Distributed Systems
ICPP '00 Proceedings of the 2000 International Conference on Parallel Processing
Adaptive incremental checkpointing for massively parallel systems
Proceedings of the 18th annual international conference on Supercomputing
MobiDesk: mobile virtual desktop computing
Proceedings of the 10th annual international conference on Mobile computing and networking
Current Practice and a Direction Forward in Checkpoint/Restart Implementations for Fault Tolerance
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 18 - Volume 19
The design and implementation of Zap: a system for migrating computing environments
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementation
A channel memory based fault tolerance for MPI applications
Future Generation Computer Systems - Special issue: Parallel computing technologies
Strategies for storage of checkpointing data using non-dedicated repositories on Grid systems
MGC '05 Proceedings of the 3rd international workshop on Middleware for grid computing
Strategies for Checkpoint Storage on Opportunistic Grids
IEEE Distributed Systems Online
Reducing downtime due to system maintenance and upgrades
LISA '05 Proceedings of the 19th Large Installation System Administration Conference - Volume 19
DejaView: a personal virtual computer recorder
Proceedings of the twenty-first ACM SIGOPS symposium on Operating systems principles
Algorithm-based fault tolerance applied to high performance computing
Journal of Parallel and Distributed Computing
Characterizing fault tolerance in genetic programming
BADS '09 Proceedings of the 2009 workshop on Bio-inspired algorithms for distributed systems
Characterizing fault tolerance in genetic programming
Future Generation Computer Systems
Computational efficiency and practical implications for a client grid
HPCC'06 Proceedings of the Second international conference on High Performance Computing and Communications
Improving speedup and response times by replicating parallel programs on a SNOW
JSSPP'04 Proceedings of the 10th international conference on Job Scheduling Strategies for Parallel Processing