State Checksum and Its Role in System Stabilization

Authors:
Chin-Tser Huang; Mohamed G. Gouda
Affiliations:
University of South Carolina at Columbia; University of Texas at Austin
Venue:
ICDCSW '05: Proceedings of the Fourth International Workshop on Assurance in Distributed Systems and Networks (ADSN), Volume 01
Year:
2005

Citing 0
Cited 2

Safe and Eventually Safe: Comparing Self-stabilizing and Non-stabilizing Algorithms on a Common Ground
OPODIS '09 Proceedings of the 13th International Conference on Principles of Distributed Systems

Fault masking in tri-redundant systems
SSS'06 Proceedings of the 8th international conference on Stabilization, safety, and security of distributed systems

Abstract

Although a self-stabilizing system that suffers from a transient fault is guaranteed to converge to a legitimate state after a finite number of steps, the convergence can be slow if the harmful effects of the fault are allowed to propagate into many processes in the system. Moreover, some safety properties of the system may be violated during the convergence. To address these problems, we propose in this paper the concept of a state checksum -- a redundancy that can be added to the state of a self-stabilizing system so that some classes of faults become visible to the system, and the system can limit the propagation of their harmful effects, and maintain its safety properties during the convergence. To make these concepts concrete, we discuss the case study of a token ring and show how to use fault-detecting and fault-correcting checksums to detect visible faults, limit the propagation of their harmful effects, and ensure that the safety properties of the ring are maintained during the convergence from these faults.
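
The abstract does not give the paper's actual construction, so the following is only an illustrative sketch of the general idea: a redundant field stored alongside a process's state lets a reader notice when a transient fault has corrupted the state, and refuse to copy it. The names `checksum`, `ProcessState`, and `read_neighbor`, and the particular checksum formula, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


def checksum(value: int) -> int:
    # Hypothetical fault-detecting checksum (an assumption, not the paper's).
    # It is injective for small counter values, so corrupting `value` alone
    # always produces a mismatch.
    return (value * 31 + 7) % 251


@dataclass
class ProcessState:
    value: int  # e.g. the counter a token-ring process holds
    check: int  # redundancy added to the state

    @classmethod
    def make(cls, value: int) -> "ProcessState":
        # Build a state whose checksum is consistent with its value.
        return cls(value, checksum(value))

    def is_visible_fault(self) -> bool:
        # A fault is "visible" when value and checksum disagree.
        return self.check != checksum(self.value)


def read_neighbor(own: ProcessState, neighbor: ProcessState) -> ProcessState:
    # Adopt the neighbor's value only if its checksum verifies; otherwise
    # keep the current state so the corrupted value does not propagate.
    if neighbor.is_visible_fault():
        return own
    return ProcessState.make(neighbor.value)


if __name__ == "__main__":
    p0 = ProcessState.make(3)
    p1 = ProcessState.make(2)
    p0.value = 5  # simulated transient fault: value changes, checksum does not
    print(p0.is_visible_fault())
    print(read_neighbor(p1, p0) == p1)
```

In this sketch, a fault that happens to corrupt both fields consistently would go undetected, which matches the abstract's wording that only some classes of faults become visible.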