Distributed embedded systems are now employed in many safety-critical applications such as X-by-Wire. These systems are composed of several nodes interconnected by a network. Studies show that a transient fault in the communication controller of a network node can lead to errors in the node where the fault occurred (called original errors) and/or in neighboring nodes (called follow-up errors). The communication controller of a node can be halted by an error, and that error may be a follow-up error. In this situation, a follow-up error halts the correct operation of a fault-free controller while the faulty controller in the fault-site node continues to operate. In this paper, an analysis shows that the probability of follow-up errors in communication protocols is non-negligible. Consequently, it is important to provide a technique that lets each node recognize the nature of an error, i.e. whether it is original or follow-up. This paper proposes a novel low-cost monitoring technique to differentiate follow-up errors from original errors. The proposed technique is based on monitoring the operational states of a communication controller. In this paper, the technique is applied to the FlexRay protocol. However, it is applicable to any communication protocol with an FSM-based description, such as FlexRay, TTP/C, and TT-Ethernet. To evaluate the monitoring technique, a FlexRay-based network of 4 nodes was designed and implemented, and the monitoring technique was implemented inside each node of the network. A total of 135,600 transient bit-flip faults were injected into the communication controller of one node. About 6.0% of the injected faults led to original errors, and about 6.1% led to follow-up errors. The results also showed that the proposed technique differentiates between follow-up and original errors with an accuracy of about 97% at a hardware overhead of only 1.4%. This level of accuracy and cost makes the proposed technique a feasible solution for enhancing the reliability of communication controllers.
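
The abstract does not give the classification rule itself, so the following is only a minimal sketch of what monitoring the operational states of an FSM-based controller could look like. The state names loosely follow FlexRay's protocol operation control states, but the transition table is a simplified illustration rather than the full specification, and the names `State`, `ALLOWED`, `classify`, and `bus_error_seen` are hypothetical. The decision rule is likewise an assumption: an illegal transition is attributed to local state corruption (original error), while a legal degradation that coincides with errors observed on the bus is attributed to another node (follow-up error).

```python
from enum import Enum, auto


class State(Enum):
    CONFIG = auto()
    READY = auto()
    STARTUP = auto()
    NORMAL_ACTIVE = auto()
    NORMAL_PASSIVE = auto()
    HALT = auto()


# Simplified, illustrative transition table (an assumption, not the
# complete state machine from the FlexRay specification).
ALLOWED = {
    State.CONFIG: {State.READY},
    State.READY: {State.STARTUP, State.CONFIG},
    State.STARTUP: {State.NORMAL_ACTIVE, State.READY},
    State.NORMAL_ACTIVE: {State.NORMAL_PASSIVE, State.HALT},
    State.NORMAL_PASSIVE: {State.NORMAL_ACTIVE, State.HALT},
    State.HALT: {State.CONFIG},
}

# States treated as "degraded" by this sketch.
DEGRADED = {State.NORMAL_PASSIVE, State.HALT}


def classify(trace, bus_error_seen):
    """Classify a recorded state trace as 'none', 'original', or 'follow-up'.

    Hypothetical rule, not taken from the paper:
    - any transition outside ALLOWED means the local state was corrupted,
      so the error is reported as 'original';
    - a legal trace ending in a degraded state is 'follow-up' if the node
      observed errors on the bus, otherwise 'original'.
    """
    for prev, cur in zip(trace, trace[1:]):
        if prev is cur:
            continue  # staying in the same state is always fine
        if cur not in ALLOWED[prev]:
            return "original"
    if trace and trace[-1] in DEGRADED:
        return "follow-up" if bus_error_seen else "original"
    return "none"


if __name__ == "__main__":
    halted = [State.NORMAL_ACTIVE, State.NORMAL_PASSIVE, State.HALT]
    print(classify(halted, bus_error_seen=True))
    print(classify([State.NORMAL_ACTIVE, State.CONFIG], bus_error_seen=False))
```

A table-driven check of this kind needs only the current state, the previous state, and one error flag, which is in the spirit of the low hardware overhead reported above, although the actual monitor described in the paper may be structured quite differently.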