We investigate whether asynchronous computational models and asynchronous algorithms can be considered for designing real-time distributed fault-tolerant systems. A priori, the lack of bounded finite delays is at odds with timeliness requirements. We show how to circumvent this apparent contradiction via the principle of "late binding" of a solution to some (partially) synchronous model. This principle is shown to maximize the coverage of demonstrated safety, liveness, and timeliness properties. These general results are illustrated with the Uniform Consensus (UC) and the Real-Time UC problems, assuming processor crashes and reliable communications, and considering asynchronous solutions based upon Unreliable Failure Detectors. We introduce the concept of Fast Failure Detectors and show that the problem of building Strong or Perfect Fast Failure Detectors in real systems can be stated as a distributed message scheduling problem. A generic solution to this problem is given and illustrated with deterministic Ethernets. In passing, it is shown that, with our construction of Unreliable Failure Detectors, asynchronous algorithms that solve UC have a worst-case termination lower bound that matches the optimal synchronous lower bound, that is, (t+1)D, where t is the maximum number of processors that may crash and D is the maximum interprocess message delay. Finally, we introduce FastUC, a novel solution to UC that is based upon Fast Failure Detectors. FastUC has a worst-case termination time that is sublinear in tD. For most practical cases and common values of t, FastUC terminates in D, making it a worst-case time optimal solution to Real-Time UC.
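To make the (t+1)D bound concrete, the following minimal sketch evaluates it for one set of values. The function name and the sample numbers (t = 2, D = 5 ms) are assumptions chosen for illustration only; they do not come from the paper.

```python
def synchronous_uc_bound(t: int, d: float) -> float:
    """Return (t+1)*D, the worst-case termination bound quoted in the abstract.

    t: maximum number of processors that may crash.
    d: maximum interprocess message delay (any time unit).
    """
    if t < 0 or d < 0:
        raise ValueError("t and d must be non-negative")
    return (t + 1) * d


# Hypothetical values: up to 2 crashes, 5 ms maximum message delay.
t, d_ms = 2, 5.0
bound_ms = synchronous_uc_bound(t, d_ms)
print(bound_ms)  # 15.0
```

With these assumed values the bound is 15 ms, whereas an algorithm that terminates in D, as the abstract states FastUC does in most practical cases, would finish within 5 ms.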