Fault-Tolerance Through Scheduling of Aperiodic Tasks in Hard Real-Time Multiprocessor Systems
IEEE Transactions on Parallel and Distributed Systems
A Fault-Tolerant Scheduling Algorithm for Real-Time Periodic Tasks with Possible Software Faults
IEEE Transactions on Computers
Design Considerations in Boeing 777 Fly-By-Wire Computers
HASE '98 The 3rd IEEE International Symposium on High-Assurance Systems Engineering
A 1.3GHz fifth generation SPARC64 microprocessor
Proceedings of the 40th annual Design Automation Conference
Software Rejuvenation: Analysis, Module and Applications
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
SWIFT: Software Implemented Fault Tolerance
Proceedings of the international symposium on Code generation and optimization
Design and Evaluation of Hybrid Fault-Detection Systems
Proceedings of the 32nd annual international symposium on Computer Architecture
Compiler-guided register reliability improvement against soft errors
Proceedings of the 5th ACM international conference on Embedded software
Exploration of distributed shared memory architectures for NoC-based multiprocessors
Journal of Systems Architecture: the EUROMICRO Journal
A Transient-Resilient System-on-a-Chip Architecture with Support for On-Chip and Off-Chip TMR
EDCC-7 '08 Proceedings of the 2008 Seventh European Dependable Computing Conference
The multikernel: a new OS architecture for scalable multicore systems
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Segment gating for static energy reduction in Networks-on-Chip
Proceedings of the 2nd International Workshop on Network on Chip Architectures
The use of triple-modular redundancy to improve computer reliability
IBM Journal of Research and Development
Redundancy management technique for space shuttle computers
IBM Journal of Research and Development
Exhaustive testing of safety critical Java
Proceedings of the 8th International Workshop on Java Technologies for Real-Time and Embedded Systems
COTS-based applications in space avionics
Proceedings of the Conference on Design, Automation and Test in Europe
Analysis and optimization of fault-tolerant embedded systems with hardened processors
Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
Performance and time to market requirements cause many real-time designers to consider components, off the shelf (COTS) for real-time cyber-physical systems. Massive multi-core embedded processors with network-on-chip (NoC) designs to facilitate core-to-core communication are becoming common in COTS. These architectures benefit real-time scheduling, but they also pose predictability challenges. In this work, we develop a framework for Fault Observant and Correcting Real-Time Embedded design (Forte) that utilizes massive multi-core NoC designs to reduce overhead by up to an order of magnitude and to lower jitter in systems via utilizing message passing instead of shared memory as the means for intra-processor communication. Message passing, which is shown to improve the overall scalability of the system, is utilized as the basis for replication and task rejuvenation. This improves fault resilience by orders of magnitude. To our knowledge, this work is the first to systematically map real-time tasks onto massive multi-core processors with support for fault tolerance that considers NoC effects on scalability on an real hardware platform and not just in simulation.