Smooth and Efficient Integration of High-Availability in a Parallel Single Level Store System
Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
A Single Level Store (SLS) that integrates a Shared Virtual Memory and a Parallel File System, with file mapping as its interface, is attractive for executing high-performance applications in a cluster. However, in a cluster the probability of a node reboot or failure is quite high. In this paper, we present the design of a highly available SLS system. Our approach combines checkpointing in memory with permanent checkpointing on disk, using all of the cluster's memory and disk resources. Preliminary performance results show the applicability of the proposed approach to parallel applications with huge input/output requirements.
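The combination of in-memory and on-disk checkpointing mentioned above can be illustrated with a small single-process sketch. This is not the paper's design: the class name `TwoLevelCheckpointer`, the `disk_every` policy, and the use of a local variable to stand in for another node's memory are all assumptions made purely for illustration.

```python
import os
import pickle
import tempfile


class TwoLevelCheckpointer:
    """Toy model (hypothetical): frequent in-memory checkpoints, occasional on-disk ones."""

    def __init__(self, disk_path, disk_every=3):
        self.disk_path = disk_path    # assumed location of the permanent checkpoint
        self.disk_every = disk_every  # assumed policy: every Nth checkpoint also goes to disk
        self.peer_memory = None       # stands in for the memory of another cluster node
        self.count = 0

    def checkpoint(self, state):
        blob = pickle.dumps(state)
        # Level 1: keep a copy in memory (in a real cluster, on a different node).
        self.peer_memory = blob
        self.count += 1
        if self.count % self.disk_every == 0:
            # Level 2: write to a temporary file, then rename it atomically so that
            # a crash during the write never leaves a half-written checkpoint.
            tmp = self.disk_path + ".tmp"
            with open(tmp, "wb") as f:
                f.write(blob)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, self.disk_path)

    def recover(self, memory_lost=False):
        # Prefer the in-memory copy; fall back to disk if that copy is gone.
        if not memory_lost and self.peer_memory is not None:
            return pickle.loads(self.peer_memory)
        if os.path.exists(self.disk_path):
            with open(self.disk_path, "rb") as f:
                return pickle.load(f)
        return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        cp = TwoLevelCheckpointer(os.path.join(d, "ckpt.bin"), disk_every=3)
        for step in range(1, 6):
            cp.checkpoint({"step": step})
        print(cp.recover())                  # latest state, from memory
        print(cp.recover(memory_lost=True))  # older state, from disk
```

In this sketch the in-memory copy is always the most recent checkpoint, while the disk copy may lag behind it; recovery from disk is only needed when the in-memory copy is lost as well.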