Instruction-Level Fault Tolerance Configurability

  • Authors:
  • Demid Borodin; B. H. Juurlink; Said Hamdioui; Stamatis Vassiliadis

  • Affiliations:
  • Computer Engineering Laboratory, Faculty of Electrical Engineering, Mathematics, and Computer Science, Delft University of Technology, Delft, The Netherlands 2628 CD (all authors)

  • Venue:
  • Journal of Signal Processing Systems
  • Year:
  • 2009

Abstract

Due to modern technology trends such as decreasing feature sizes and lower voltage levels, fault tolerance (FT) is becoming increasingly important in computing systems. Several schemes have been proposed that let a user configure FT at the application level, so that the user can trade stronger FT for performance or vice versa. In this paper, we propose supporting instruction-level rather than application-level configurability of FT, since different parts of some applications (e.g., multimedia) can have different reliability requirements. Weak or no FT is applied to the less critical parts, resulting in time and/or resource gains. These gains can be used to apply stronger FT techniques to the more critical parts, thereby increasing the overall reliability. The paper shows how some existing FT techniques can be adapted to support instruction-level FT configurability, how a programmer can specify the desired FT level of the instructions, and how the compiler can manage it automatically. The existing FT scheme EDDI (which duplicates all instructions) is compared with the proposed approach at both the kernel level and the full-application level. The simulation results show that both performance and energy consumption improve significantly (by up to 50% at the kernel level and up to 16% at the full-application level), while the fault coverage depends on the application. For the full application (a JPEG encoder), our approach is applied to only one kernel in order to avoid increasing the programming effort significantly.
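
The difference between duplicating every instruction and configuring FT per instruction can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the implementation evaluated in the paper: the names FTLevel, Executor, FaultDetected, and brighten are hypothetical, Python function calls stand in for machine instructions, and the choice to protect loop control while leaving pixel arithmetic unprotected is an assumption made for illustration.

```python
from enum import Enum


class FTLevel(Enum):
    """Hypothetical per-operation FT levels (illustrative, not from the paper)."""
    NONE = 0       # execute once, no checking
    DUPLICATE = 1  # execute twice and compare, in the spirit of EDDI


class FaultDetected(Exception):
    """Raised when the original and the duplicate result disagree."""


class Executor:
    """Runs operations at a requested FT level and counts executions."""

    def __init__(self):
        self.executed = 0  # number of operation executions, duplicates included

    def run(self, op, *args, level):
        self.executed += 1
        result = op(*args)
        if level is FTLevel.DUPLICATE:
            # Re-execute and compare; a mismatch signals a transient fault.
            self.executed += 1
            if op(*args) != result:
                raise FaultDetected("duplicate execution disagrees")
        return result


def brighten(pixels, delta, executor, data_level):
    """Toy multimedia kernel: add delta to each pixel, saturating at 255.

    Loop control is always duplicated (assumed critical here); the pixel
    arithmetic runs at the caller-chosen data_level.
    """
    out = []
    i = 0
    while executor.run(lambda a, b: a < b, i, len(pixels), level=FTLevel.DUPLICATE):
        value = executor.run(lambda p, d: min(255, p + d),
                             pixels[i], delta, level=data_level)
        out.append(value)
        i = executor.run(lambda a: a + 1, i, level=FTLevel.DUPLICATE)
    return out


pixels = [10, 200, 250, 90]

full = Executor()       # everything duplicated
full_result = brighten(pixels, 10, full, FTLevel.DUPLICATE)

selective = Executor()  # pixel arithmetic left unprotected
selective_result = brighten(pixels, 10, selective, FTLevel.NONE)

print(full_result == selective_result)    # same output in the fault-free case
print(full.executed, selective.executed)  # fewer executions when selective
```

In this sketch the fault-free output is identical in both configurations, while the selective run executes fewer operations. The price is that a fault in an unprotected pixel operation goes undetected, which is consistent with the abstract's remark that fault coverage depends on the application.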