We study the behavior of mutual exclusion algorithms in the presence of unreliable shared memory subject to transient memory faults. It is well known that classical 2-process mutual exclusion algorithms, such as Dekker's and Peterson's algorithms, are not fault-tolerant; in this paper we ask what degree of fault tolerance can be achieved using the same restricted resources as Dekker's and Peterson's algorithms, namely, three binary read/write registers. We show that if one memory fault can occur, it is not possible to guarantee both mutual exclusion and deadlock-freedom using three binary registers; this holds in general when fewer than 2f+1 binary registers are used and f of them may be faulty. Hence we focus on algorithms that guarantee (a) mutual exclusion and starvation-freedom in fault-free executions, and (b) only mutual exclusion in faulty executions. We show that using only three binary registers it is possible to design a 2-process mutual exclusion algorithm which tolerates a single memory fault in this manner. Further, by replacing one read/write register with a test&set register, we can guarantee mutual exclusion in executions where one variable experiences unboundedly many faults. In the more general setting where up to f registers may be faulty, we show that it is not possible to guarantee mutual exclusion using 2f+1 binary read/write registers if each faulty register can exhibit unboundedly many faults. On the positive side, we show that an n-variable single-fault tolerant algorithm satisfying certain conditions can be transformed into an ((n-1)f+1)-variable f-fault tolerant algorithm with the same progress guarantee as the original. In combination with our three-variable algorithm, this implies that there is a (2f+1)-variable mutual exclusion algorithm tolerating a single fault in up to f variables without violating mutual exclusion.
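To make concrete the claim that classical algorithms are not fault-tolerant, the following is a minimal sketch of the textbook Peterson algorithm over three binary registers (`flag[0]`, `flag[1]`, `turn`), run under a fixed schedule with an optional transient fault. It is not the fault-tolerant algorithm of this paper. The step-by-step simulation, the `run` helper, the `fault_at` parameter, and the fault model (one bit flip of `flag[0]`) are all assumptions introduced purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Peterson:
    """Textbook Peterson lock for 2 processes over three binary registers."""
    flag: list = field(default_factory=lambda: [0, 0])  # two binary registers
    turn: int = 0                                        # third binary register
    pc: list = field(default_factory=lambda: [0, 0])     # per-process program counter

    def step(self, i):
        """Execute one atomic step of process i (0 or 1)."""
        j = 1 - i
        if self.pc[i] == 0:        # announce interest
            self.flag[i] = 1
            self.pc[i] = 1
        elif self.pc[i] == 1:      # yield priority to the other process
            self.turn = j
            self.pc[i] = 2
        elif self.pc[i] == 2:      # spin while the other process has priority
            if not (self.flag[j] == 1 and self.turn == j):
                self.pc[i] = 3     # enter the critical section
        elif self.pc[i] == 3:      # leave the critical section
            self.flag[i] = 0
            self.pc[i] = 0

    def in_cs(self, i):
        return self.pc[i] == 3


def run(schedule, fault_at=None):
    """Run the schedule (a list of process ids, one per step).

    Hypothetical fault model: just before step index `fault_at`,
    the register flag[0] suffers a single transient bit flip.
    Returns True if both processes were ever in the critical
    section at the same time.
    """
    lock = Peterson()
    violated = False
    for k, i in enumerate(schedule):
        if k == fault_at:
            lock.flag[0] ^= 1      # transient memory fault
        lock.step(i)
        violated = violated or (lock.in_cs(0) and lock.in_cs(1))
    return violated


schedule = [0, 0, 0, 1, 1, 1]      # process 0 enters first, then process 1 tries
print(run(schedule))               # fault-free run
print(run(schedule, fault_at=3))   # flag[0] flipped while process 0 is inside
```

In the fault-free run, process 1 keeps spinning because it still reads `flag[0] == 1`. In the faulty run, the flip erases process 0's announcement while it is in the critical section, so process 1 enters as well and mutual exclusion is violated by a single fault.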