DIVA: a reliable substrate for deep submicron microarchitecture design
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Transient fault detection via simultaneous multithreading
Proceedings of the 27th annual international symposium on Computer architecture
Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
SWIFT: Software Implemented Fault Tolerance
Proceedings of the international symposium on Code generation and optimization
NonStop® Advanced Architecture
DSN '05 Proceedings of the 2005 International Conference on Dependable Systems and Networks
Chip multiprocessing and the cell broadband engine
Proceedings of the 3rd conference on Computing frontiers
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Compiler-Managed Software-based Redundant Multi-Threading for Transient Fault Detection
Proceedings of the International Symposium on Code Generation and Optimization
Using Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance
DSN '07 Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
Taming subsystems: capabilities as universal resource access control in L4
Proceedings of the Second Workshop on Isolation and Integration in Embedded Systems
AN-Encoding Compiler: Building Safety-Critical Systems with Commodity Hardware
SAFECOMP '09 Proceedings of the 28th International Conference on Computer Safety, Reliability, and Security
Architecture Design for Soft Errors
Architecture Design for Soft Errors
FlexSC: flexible system call scheduling with exception-less system calls
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
A case for NUMA-aware contention management on multicore systems
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
SCC: A Flexible Architecture for Many-Core Platform Research
Computing in Science and Engineering
Operating system support for redundant multithreading
Proceedings of the tenth ACM international conference on Embedded software
Hi-index | 0.00 |
We present the design and initial evaluation of a resilient operating system architecture that leverages HW architectures combining few resilient with many nonresilient CPU cores. To this end, we build our system around a Reliable Computing Base (RCB) consisting of those software components that must work for reliable operation, and run the RCB on the resilient cores. The remainder of the system runs replicated on unreliable cores. Our system's RCB consists of an L4 microkernel, a runtime environment and a replication manager. In this paper we state and justify assumptions about the hardware architecture, motivate the corresponding software architecture and evaluate communication mechanisms between the RCB and the replicas.