Hardware-Related Software Errors: Measurement and Analysis

  • Authors:
  • R. K. Iyer;P. Velardi

  • Affiliations:
  • Computer Systems Group at the Coordinated Science Laboratory and the Department of Electrical and Computer Engineering, University of Illinois;-

  • Venue:
  • IEEE Transactions on Software Engineering
  • Year:
  • 1985

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper describes an analysis of hardware-related software (HW/SW) errors on an MVS/SP operating system at Stanford University. The analysis procedure demonstrates a methodology for evaluating the interaction between hardware and software as it relates to system reliability. The paper examines the operating system's handling of HW/SW errors and also the effectiveness of recovery management. Nearly 35 percent of all observed software failures were found to be hareware-related. The analysis shows that the operating system is seldom able to diagnose that a software error may be hardware-related. The impact of HW/SW errors on the system is evaluated by measuring the effectiveness of system recovery in containing the propagation of HW/SW errors. The system failure probability for HW/SW errors is close to three times that for software errors in general. The observed HW/SW errors are seen to have a specific pattern, suggesting the possibility of the use of such error patterns for intelligent error prediction and recovery.