Advances in real-time, embedded, and distributed systems, together with control and communication theory, have catalyzed the rapid emergence of cyber-physical systems such as self-driving cars. Recent research has emphasized the importance of fault-tolerance support for a cyber-physical system (CPS), because a CPS senses its surroundings, processes the sensor data, and reacts through its actuators. To tackle this challenge, we proposed SAFER (System-level Architecture for Failure Evasion in Real-time Applications) in our previous work. SAFER tolerates fail-stop processor and/or task failures in distributed embedded real-time systems. One of its limitations, however, is that it cannot tolerate the failure of a processor that has a dedicated connection to an actuator. This paper presents a method that relaxes this limitation by (1) deploying a small piece of hardware that removes the need for a dedicated connection between a processor and an actuator, (2) adding a software module that monitors and controls the hardware, and (3) enhancing the failure detection and recovery mechanisms of SAFER to support these changes. The detailed implementation and evaluation of the SAFER extension are ongoing work.
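
The three-part method can be illustrated with a minimal sketch. It is not the SAFER implementation, which the abstract describes as ongoing work. The sketch assumes that fail-stop failures are detected by missed heartbeats and that the added hardware behaves like a switch routing one processor at a time to the actuator. The names `ActuatorSwitch`, `SwitchMonitor`, `ecu_a`, and `ecu_b`, as well as the timeout value, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ActuatorSwitch:
    """Hypothetical stand-in for the small hardware unit placed between
    the processors and the actuator."""
    active: str  # name of the processor currently routed to the actuator

    def route_to(self, processor: str) -> None:
        self.active = processor


@dataclass
class SwitchMonitor:
    """Hypothetical software module that monitors heartbeats and controls the switch."""
    switch: ActuatorSwitch
    backups: List[str]
    timeout: float  # seconds of silence before a processor is presumed failed (assumed)
    last_seen: Dict[str, float] = field(default_factory=dict)

    def heartbeat(self, processor: str, now: float) -> None:
        # Record the time of the most recent heartbeat from this processor.
        self.last_seen[processor] = now

    def is_alive(self, processor: str, now: float) -> bool:
        seen = self.last_seen.get(processor)
        return seen is not None and now - seen <= self.timeout

    def check(self, now: float) -> str:
        """If the active processor is silent, reroute the actuator to the
        first live backup. Returns the processor that is active afterwards."""
        if not self.is_alive(self.switch.active, now):
            for candidate in self.backups:
                if candidate != self.switch.active and self.is_alive(candidate, now):
                    self.switch.route_to(candidate)
                    break
        return self.switch.active


switch = ActuatorSwitch(active="ecu_a")
monitor = SwitchMonitor(switch=switch, backups=["ecu_b"], timeout=0.1)
monitor.heartbeat("ecu_a", now=0.00)
monitor.heartbeat("ecu_b", now=0.00)
monitor.heartbeat("ecu_b", now=0.15)  # ecu_a has gone silent by this point
print(monitor.check(now=0.20))  # prints "ecu_b"
```

Passing timestamps in explicitly keeps the sketch deterministic. A real-time system would instead drive `check` from a periodic task with a bounded detection latency.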