CORAL - online monitoring in distributed applications: issues and solutions
WSEAS Transactions on Computers
In this paper we show that process migration can be implemented successfully for software DSM environments. We have developed a migration framework that transparently migrates DSM processes while preserving the consistency of running applications. The framework is integrated into the CORAL system, an on-line monitoring system that connects parallel tools to a running application. Special emphasis has been placed on techniques and mechanisms for migrating shared resources and communication channels, as well as internal monitoring data structures. Currently, the framework migrates parallel processes based on the TreadMarks library, and the Condor library is used for the state transfer of a single process. In a computing environment of eight nodes running TreadMarks applications, the migration framework adds 10% overhead to Condor, and this overhead grows almost linearly as nodes are added. Although our first implementation supports only TreadMarks applications, both the monitoring system and the migration framework are designed to be reusable and easily adaptable to other software DSM systems.