Research and development in fault-tolerant computing has shown that a dedicated processor, called a maintenance processor, can efficiently monitor, control, and maintain the operation of its host computer. This paper presents the general system structure and common functional capabilities of the maintenance processor, and illustrates its use with a survey of actual implementations available in the general-purpose computer industry. An analytical model is then presented to evaluate the impact of the maintenance processor on the reliability, availability, and serviceability (RAS) of the host system. The examples given show that the unavailability of a typical maintenance processor contributes only negligible additional downtime and system failures. This observation, together with others included in the paper, strongly indicates that a maintenance processor can be designed and used as the focal point of most system support activities. This approach simplifies the hardware and software structure of the host computer and improves the total system RAS.