Fast Asynchronous Uniform Consensus in Real-Time Distributed Systems

Authors:
Jean-François Hermant;Gérard Le Lann
Venue:
IEEE Transactions on Computers
Year:
2002

Citing 17

Multiple-access protocols and time-constrained communication
ACM Computing Surveys (CSUR)

On the minimal synchronism needed for distributed consensus
Journal of the ACM (JACM)

Consensus in the presence of partial synchrony
Journal of the ACM (JACM)

Analysis of hard real-time communications
Real-Time Systems

Impossibility of distributed consensus with one faulty process
Journal of the ACM (JACM)

Unreliable failure detectors for reliable distributed systems
Journal of the ACM (JACM)

The Timed Asynchronous Distributed System Model
IEEE Transactions on Parallel and Distributed Systems

Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment
Journal of the ACM (JACM)

Indulgent algorithms (preliminary version)
Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing

Distributed Algorithms
Distributed Algorithms

On Real-Time and Non Real-Time Distributed Computing
WDAG '95 Proceedings of the 9th International Workshop on Distributed Algorithms

Proof-Based System Engineering and Embedded Systems
Lectures on Embedded Systems, European Educational Forum, School on Embedded Systems

The Timely Computing Base: Timely Actions in the Presence of Uncertain Timeliness
DSN '00 Proceedings of the 2000 International Conference on Dependable Systems and Networks (formerly FTCS-30 and DCCA-8)

Consensus: The Big Misunderstanding
FTDCS '97 Proceedings of the 6th IEEE Workshop on Future Trends of Distributed Computing Systems

Consensus Based on Failure Detectors with a Perpetual Accuracy Property
IPDPS '00 Proceedings of the 14th International Symposium on Parallel and Distributed Processing

A Protocol and Correctness Proofs for Real-Time High-Performance Broadcast Networks
ICDCS '98 Proceedings of the 18th International Conference on Distributed Computing Systems

Asynchronous Protocols to Meet Real-Time Constraints: Is It Really Sensible? How to Proceed?
ISORC '98 Proceedings of the 1st IEEE International Symposium on Object-Oriented Real-Time Distributed Computing

Cited 13

The Timewheel Group Communication System
IEEE Transactions on Computers

The Timely Computing Base Model and Architecture
IEEE Transactions on Computers

On the Impact of Fast Failure Detectors on Real-Time Fault-Tolerant Systems
DISC '02 Proceedings of the 16th International Conference on Distributed Computing

Fast Scheduling of Distributable Real-Time Threads with Assured End-to-End Timeliness
Ada-Europe '08 Proceedings of the 13th Ada-Europe international conference on Reliable Software Technologies

Towards a real-time distributed computing model
Theoretical Computer Science

Recovering from distributable thread failures in distributed real-time Java
ACM Transactions on Embedded Computing Systems (TECS)

The failure detector abstraction
ACM Computing Surveys (CSUR)

Group communication: from practice to theory
SOFSEM'06 Proceedings of the 32nd conference on Current Trends in Theory and Practice of Computer Science

Failure detection with booting in partially synchronous systems
EDCC'05 Proceedings of the 5th European conference on Dependable Computing

Novel generic middleware building blocks for dependable modular avionics systems
EDCC'05 Proceedings of the 5th European conference on Dependable Computing

Proof-based system engineering using a virtual system model
ISAS'05 Proceedings of the Second international conference on Service Availability

Implementing reliable distributed real-time systems with the Θ-model
OPODIS'05 Proceedings of the 9th international conference on Principles of Distributed Systems

Dependable systems
Dependable Systems

Quantified Score

Hi-index	14.99

Abstract

We investigate whether asynchronous computational models and asynchronous algorithms can be considered for designing real-time distributed fault-tolerant systems. A priori, the lack of bounded finite delays is antagonistic with timeliness requirements. We show how to circumvent this apparent contradiction, via the principle of "late binding" of a solution to some (partially) synchronous model. This principle is shown to maximize the coverage of demonstrated safety, liveness, and timeliness properties. These general results are illustrated with the Uniform Consensus (UC) and the Real-Time UC problems, assuming processor crashes and reliable communications, considering asynchronous solutions based upon Unreliable Failure Detectors. We introduce the concept of Fast Failure Detectors and we show that the problem of building Strong or Perfect Fast Failure Detectors in real systems can be stated as a distributed message scheduling problem. A generic solution to this problem is given, illustrated considering deterministic Ethernets. In passing, it is shown that, with our construction of Unreliable Failure Detectors, asynchronous algorithms that solve UC have a worst-case termination lower bound that matches the optimal synchronous lower bound, that is, (t+1)D, where t is the maximum number of processors that may crash and D is the maximum interprocess message delay. Finally, we introduce FastUC, a novel solution to UC, that is based upon Fast Failure Detectors. FastUC has a worst-case termination time that is sublinear in tD. For most practical cases and common values of t, FastUC terminates in D, making it a worst-case time optimal solution to Real-Time UC.
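
To make the bound quoted in the abstract concrete, the following minimal sketch evaluates (t+1)D and compares it with a termination time of D. The function name and the sample values (t = 2, D = 10 ms) are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch does not model FastUC itself.

```python
def worst_case_bound(t: int, d: float) -> float:
    """Return (t + 1) * D, the worst-case termination bound quoted in the abstract.

    t: maximum number of processors that may crash.
    d: maximum interprocess message delay (any time unit).
    """
    if t < 0 or d <= 0:
        raise ValueError("t must be >= 0 and d must be > 0")
    return (t + 1) * d


# Hypothetical values, not taken from the paper: t = 2 crashes, D = 10 ms.
t, d_ms = 2, 10.0
bound_ms = worst_case_bound(t, d_ms)  # (2 + 1) * 10.0
speedup = bound_ms / d_ms             # ratio against a termination time of D
print(bound_ms, speedup)
```

Under these assumed values the (t+1)D bound is three times larger than D, which is the gap the abstract refers to when it states that FastUC terminates in D for most practical cases.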