Resilient multi-core systems: a hierarchical formal model for N-variant executions

  • Authors:
  • Axel Krings; Li Tan; Clinton Jeffery; Robert Rinker

  • Affiliations:
  • University of Idaho, Moscow, ID; Washington State University, Richland, WA; University of Idaho, Moscow, ID; University of Idaho, Moscow, ID

  • Venue:
  • Proceedings of the 5th Annual Workshop on Cyber Security and Information Intelligence Research: Cyber Security and Information Intelligence Challenges and Strategies
  • Year:
  • 2009

Abstract

This research presents a hierarchical formal model that provides adjustable levels of service and quality of assurance and is especially suitable for multi-core processor systems. The multi-layered architecture supports multiple levels of fault detection, masking, and dynamic load balancing. Unlike traditional fault-tolerant architectures that treat service requirements uniformly, each layer of the assured architecture implements a different level of service and information assurance. The system achieves load balancing by moving between layers of different complexity. Functionalities range from the essential services at the lowest layer, which are necessary to satisfy the most stringent requirements for information assurance and system survivability, to increasingly sophisticated functionalities with extended capabilities and complexity at higher layers. Low-layer functionalities can be used to monitor the behavior of high-layer functionalities. At each layer of the assured architecture, N-variant implementations make efficient use of multi-core hardware. The degree of redundancy introduced in each layer determines the mix of faults that can be tolerated. The use of hybrid fault models allows us to consider fault types ranging from benign faults to Byzantine faults. Our framework extends recent work on N-variant systems for intrusion detection, which are shown to be special cases of the framework. Furthermore, it allows movement within a tradeoff space between (1) the levels of assurance provided at different layers, (2) the levels of redundancy used at specific layers, which determine the fault types that can be tolerated, and (3) the desired run-time overhead.
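
The relationship between redundancy and fault masking within a single layer can be illustrated with a minimal sketch. It is not the formal model of the paper: the function name `vote`, the treatment of a raised exception as a benign fault, and the strict-majority rule are all assumptions made for illustration. Under this rule, N variants can mask f value faults only when N ≥ 2f + 1, and Byzantine behavior in a distributed setting would need more than a single local voter.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence, Tuple


def vote(
    variants: Sequence[Callable[[int], Hashable]], x: int
) -> Tuple[Optional[Hashable], bool]:
    """Run every variant on x and majority-vote the results (hypothetical helper).

    Returns (value, diverged). value is None when no strict majority of
    all N variants agrees, i.e. the fault could not be masked.
    diverged is True whenever any variant crashed or disagreed.
    """
    outputs = []
    for variant in variants:
        try:
            outputs.append(variant(x))
        except Exception:
            # Assumption: a crash is a benign (self-evident) fault, so the
            # variant simply contributes no vote.
            continue

    if not outputs:
        return None, True

    value, count = Counter(outputs).most_common(1)[0]
    diverged = len(outputs) < len(variants) or count < len(outputs)

    # Mask the fault only if a strict majority of all N variants agrees.
    if 2 * count > len(variants):
        return value, diverged
    return None, True


def crashing_variant(x: int) -> int:
    raise RuntimeError("variant failed")


if __name__ == "__main__":
    good = [lambda x: x * x, lambda x: x ** 2, lambda x: x * x]
    one_bad = [lambda x: x * x, lambda x: x ** 2, lambda x: x * x + 1]
    print(vote(good, 3))     # all three agree
    print(vote(one_bad, 3))  # one value fault is masked but detected
```

With three variants, one faulty result is outvoted and the divergence flag can serve as a detection signal; with one crash and one wrong value among three variants, no strict majority remains and the voter reports failure instead of a value.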