Abstract

A taxonomy of fault tolerance in commercial computers is set forth. It is organized around three orthogonal axes: the sources of errors the computer tolerates, the computer's approach to tolerating errors, and the computer's structure. Each of these is briefly discussed. An example of each class in the taxonomy is presented, as well as its approach to answering the following questions: (1) Is the system to be highly reliable or highly available? (2) Do all outputs have to be correct, or only data committed to long-term storage? (3) How familiar must the user be with the architecture and software redundancy? (4) Is the system dedicated so that attributes of the application can be used to simplify fault tolerance techniques? (5) Is the system constrained to use existing components? (6) Even if the design is new, what cost and/or performance penalty does it impose on the user who does not require fault tolerance? (7) Is the system stand-alone, or can other processors be called upon to assist in times of failure? The computers covered are the VAX 8600 and IBM 3090 uniprocessors, the Tandem, Stratus, and VAXft 3000 multicomputers, and the Teradata and Sequoia multiprocessors.