Managing redundancy in CAN-based networks supporting N-Version Programming

Authors:
Julián Proenza; José Miro-Julia; Hans Hansson
Affiliations:
Universitat de les Illes Balears, Departament de Matemàtiques i Informàtica, Campus Universitari, 07122 Palma de Mallorca, Spain; Universitat de les Illes Balears, Departament de Matemàtiques i Informàtica, Campus Universitari, 07122 Palma de Mallorca, Spain; Mälardalen University, Mälardalen Real-Time Research Centre, S-721 23, Västerås, Sweden
Venue:
Computer Standards & Interfaces
Year:
2009

Abstract

Software is a major source of reliability degradation in dependable systems. One of the classical remedies is to provide software fault tolerance by using N-Version Programming (NVP). However, because NVP requires non-standard hardware as well as changes and additions at all levels of the system, NVP solutions are costly and have been used only in special cases. In previous work, a low-cost architecture for NVP execution was developed. The key features of this architecture are the use of off-the-shelf components, including communication standards, and the relocation of the fault tolerance functionality (voting, error detection, fault masking, consistency management, and recovery) into separate redundancy management circuitry, one for each redundant computing node. In this article we present an improved design of that architecture, specifically resolving some potential inconsistencies that were not treated in detail in the original design. In particular, we present novel techniques for enforcing replica determinism. Our improved architecture is based on the Controller Area Network (CAN). This choice goes beyond the obvious interest of using standards to reduce cost: the rest of the architecture is designed to take full advantage of CAN standard features, such as data consistency, in order to significantly reduce the complexity and cost, and improve the efficiency, of the resulting system. Although initially developed for NVP, our redundancy management circuitry also supports other software replication techniques, such as active replication.
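
The voting and fault-masking functions named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' redundancy management circuitry, which the abstract describes only at a high level. It assumes exact-match majority voting over the outputs of N independently developed versions, and the names `majority_vote` and `disagreeing_versions` are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value produced by a strict majority of versions.

    Assumption: outputs are compared for exact equality. Returns None
    when no value is held by more than half of the versions.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def disagreeing_versions(outputs: Sequence[Hashable],
                         voted: Optional[Hashable]) -> List[int]:
    """Return the indices of versions whose output differs from the voted value.

    This is a simple form of error detection: a version that disagrees
    with the majority is flagged as suspect.
    """
    if voted is None:
        return []
    return [i for i, out in enumerate(outputs) if out != voted]


if __name__ == "__main__":
    # Three versions of the same computation; the third one is faulty.
    results = [42, 42, 7]
    voted = majority_vote(results)
    print(voted, disagreeing_versions(results, voted))
```

With three versions, a single faulty output is masked because the other two still form a strict majority. When all versions disagree, the sketch returns `None` instead of guessing, leaving recovery to a higher layer.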