Distributed error confinement

Authors:
Yossi Azar; Shay Kutten; Boaz Patt-Shamir
Affiliations:
Tel Aviv University, Tel Aviv, Israel; Technion, Haifa, Israel; HP Cambridge Research Lab, Cambridge, MA
Venue:
Proceedings of the twenty-second annual symposium on Principles of distributed computing
Year:
2003

Abstract

We initiate the study of error confinement in distributed applications, where the goal is that only nodes that were directly hit by a fault may deviate from their correct external behavior, and only temporarily. The external behavior of all other nodes must remain impeccable, even though their internal state may be affected. Error confinement is impossible if an adversary is allowed to inflict arbitrary transient faults on the system, since the faults might completely wipe out input values. We introduce a new fault tolerance measure we call agility, which quantifies the strength of an algorithm that disseminates information against state-corrupting faults.

We study the basic problem of broadcast, and propose algorithms that guarantee error confinement with optimal agility to within a constant factor, even in asynchronous networks when the topology is unknown. These algorithms can serve as building blocks in more general reactive systems. Previous results in exploring locality in reactive systems were not error confined, and relied on the assumption (not used in the current paper) that the errors hitting each node are probabilistic, such that a faulty node itself, or its neighbor, can detect that the node is faulty.

The main algorithm uses the novel core bootstrapping technique, which seems inherent for voting in reactive networks; its analysis leads to an interesting combinatorial problem. The technique and the analysis may be of independent interest.