Movement-based checkpointing and logging for recovery in mobile computing systems
MobiDE '06 Proceedings of the 5th ACM international workshop on Data engineering for wireless and mobile access
ICPADS '06 Proceedings of the 12th International Conference on Parallel and Distributed Systems - Volume 1
Abstract: This paper presents a new checkpointing coordination scheme that exploits information about the communication pattern of the target program. We classified the communication patterns of the processes and found that, in most cases, the dependency relation that can cause cascading rollbacks, known as the domino effect, involves only two processes. For such cases, we propose a cycle detection scheme to prevent the domino effect. Even in the remaining cases, only a limited number of processes are usually involved in the domino effect. Hence, we also propose a limited coordination scheme in which coordination involves only the processes specified in the communication pattern. Using the communication pattern of the target program removes unnecessary coordination effort and can also reduce the checkpointing frequency. One possible drawback of the proposed scheme is that the rollback distance may become longer in some cases. However, the difference is minimal, and we believe it is a small price to pay at failure time compared with the reduced overhead during normal execution. Extensive simulation was performed to evaluate the proposed scheme, and we conclude that it significantly reduces checkpointing overhead compared with loose coordination schemes.
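The classification step described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not spell out. It assumes the communication pattern is given as a directed graph of who sends to whom, and it treats each set of mutually reachable processes as one dependency group. The function names (`reachable`, `dependency_groups`, `plan_coordination`) and the three category labels are hypothetical, chosen only for illustration.

```python
def reachable(graph, start):
    """Return every process reachable from `start` by following send edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for peer in graph.get(node, ()):
            if peer not in seen:
                seen.add(peer)
                stack.append(peer)
    return seen


def dependency_groups(graph):
    """Group processes that can reach each other (assumed dependency cycles)."""
    nodes = set(graph) | {q for peers in graph.values() for q in peers}
    reach = {p: reachable(graph, p) for p in nodes}
    groups, assigned = [], set()
    for p in sorted(nodes):
        if p in assigned:
            continue
        # p and q share a group when each is reachable from the other.
        group = {p} | {q for q in reach[p] if p in reach[q]}
        assigned |= group
        groups.append(sorted(group))
    return groups


def plan_coordination(graph):
    """Assign each group a handling strategy (labels are illustrative only)."""
    plan = {"no_coordination": [], "pairwise_cycle_detection": [],
            "limited_coordination": []}
    for group in dependency_groups(graph):
        if len(group) == 1:
            plan["no_coordination"].append(group)
        elif len(group) == 2:
            # Assumed mapping: two-process cycles get cycle detection.
            plan["pairwise_cycle_detection"].append(group)
        else:
            # Assumed mapping: larger groups coordinate among themselves only.
            plan["limited_coordination"].append(group)
    return plan


# Hypothetical communication pattern: process -> processes it sends to.
pattern = {
    "P0": ["P1"], "P1": ["P0"],                 # two-process cycle
    "P2": ["P3"], "P3": ["P4"], "P4": ["P2"],   # three-process cycle
    "P5": ["P0"],                               # sends only, no cycle
}
plan = plan_coordination(pattern)
```

In this hypothetical pattern, only P2, P3 and P4 would take part in coordination, P0 and P1 would rely on pairwise cycle detection, and P5 would checkpoint independently, which is the kind of saving the abstract attributes to using the communication pattern.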