Generic Timing Fault Tolerance using a Timely Computing Base

  • Authors:
  • Antonio Casimiro; Paulo Verissimo

  • Venue:
  • DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
  • Year:
  • 2002

Abstract

Designing applications with timeliness requirements in environments of uncertain synchrony is known to be a difficult problem. In this paper, we follow the perspective of timing fault tolerance: timing errors occur, and they are processed using redundancy, e.g., component replication, to recover and deliver timely service. We introduce a paradigm for generic timing fault tolerance with replicated state machines. The paradigm is based on the existence of Timing Failure Detection with timed completeness and accuracy properties. Generic timing fault tolerance implies the ability to dependably observe the system and to notify timing failures in a timely manner, which we discuss in the paper. It also ensures replica determinism with respect to time (temporal consistency), and safety in case of spare exhaustion. We show that the paradigm can be addressed and realized in the framework of the Timely Computing Base (TCB) model and architecture. Furthermore, we illustrate the generality of our approach by reviewing existing solutions and by showing that, in contrast with ours, they either secure only a restricted semantics or simply provide ad hoc solutions.
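
The idea of detecting timing failures and masking them with replicas can be illustrated with a minimal sketch. This is not the paper's protocol or the TCB interface: the names `Replica`, `is_timing_failure`, and `timely_service` are hypothetical, completion times are simulated numbers rather than real clock readings, and replicas are checked one after another for simplicity. It only shows the general pattern: a response that exceeds its deadline counts as a timing failure, a spare replica is tried instead, and the service falls back to a safe outcome when spares are exhausted.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Replica:
    """Hypothetical replica: `run` maps a start time to a simulated finish time."""
    name: str
    run: Callable[[float], float]


def is_timing_failure(start: float, finish: float, deadline: float) -> bool:
    """A timing failure occurs when the execution takes longer than the deadline."""
    return (finish - start) > deadline


def timely_service(replicas: List[Replica], start: float, deadline: float) -> Optional[str]:
    """Return the name of the first replica that answers within the deadline.

    Returns None when every replica suffers a timing failure (spare
    exhaustion), which the caller should treat as a switch to a safe state.
    """
    for replica in replicas:
        finish = replica.run(start)
        if not is_timing_failure(start, finish, deadline):
            return replica.name
    return None


# Usage: the primary is too slow, so the spare delivers the timely result.
replicas = [
    Replica("primary", lambda s: s + 0.5),
    Replica("spare", lambda s: s + 0.05),
]
chosen = timely_service(replicas, start=0.0, deadline=0.1)
all_slow = timely_service([Replica("primary", lambda s: s + 0.5)], start=0.0, deadline=0.1)
```

Returning `None` instead of a late result reflects the safety concern mentioned in the abstract: when no spare remains, the sketch prefers an explicit fail-safe signal over delivering untimely service.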