Research: Supporting fault-tolerant and open distributed processing using RPC

Authors:
Wanlei Zhou
Affiliations:
School of Computing and Mathematics, Deakin University, Geelong, VIC 3217, Australia
Venue:
Computer Communications
Year:
1996

Abstract

This paper is concerned mainly with the software aspects of achieving reliable operation in an open distributed processing environment. A system that supports the development of fault-tolerant, cross-transport-protocol distributed software is described. The fault-tolerance technique used is a variation of recovery blocks, and the distributed computing model used is the remote procedure call (RPC) model. The system incorporates fault-tolerance and cross-transport-protocol communication features into the RPC system and makes them transparent to users. Each fault-tolerant server is given a buddy that acts as its alternate. When an RPC to a server fails, the system automatically switches to the buddy to seek an alternate service. The client, the fault-tolerant server and the server's buddy can each use a different transport protocol. To obtain this fault tolerance and cross-protocol service, users need only specify their requirements in a descriptive interface definition language. All maintenance of fault tolerance and cross-protocol communication is managed by the system in a user-transparent manner. By using our system, users can have confidence in their distributed programs without having to deal with the details of fault tolerance and cross-protocol communication. Our system is small, simple and easy to use, and it can also produce server and client driver programs, and finally executable programs, directly from the server definition files.
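
The buddy mechanism described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation or its interface definition language. The names `RpcError`, `Endpoint` and `fault_tolerant_call` are hypothetical, the remote call is simulated by a local function, and the transport label is only carried along to show that the server and its buddy may differ in protocol.

```python
from dataclasses import dataclass
from typing import Any, Callable


class RpcError(Exception):
    """Hypothetical error raised when a remote call cannot be completed."""


@dataclass
class Endpoint:
    """Stand-in for a remote server; all fields are illustrative assumptions."""
    name: str
    transport: str               # e.g. "tcp" or "udp"; not used for real I/O here
    handler: Callable[..., Any]  # local function simulating the remote procedure

    def call(self, *args: Any) -> Any:
        return self.handler(*args)


def fault_tolerant_call(server: Endpoint, buddy: Endpoint, *args: Any) -> Any:
    """Try the primary server; if the call fails, retry once on its buddy."""
    try:
        return server.call(*args)
    except RpcError:
        # The caller never sees this switch, which mirrors the
        # transparency goal stated in the abstract.
        return buddy.call(*args)


def failing_add(a: int, b: int) -> int:
    raise RpcError("primary server unreachable")


def working_add(a: int, b: int) -> int:
    return a + b


primary = Endpoint("adder", "tcp", failing_add)
backup = Endpoint("adder-buddy", "udp", working_add)

result = fault_tolerant_call(primary, backup, 2, 3)
print(result)  # prints 5, served by the buddy
```

In this sketch the client code calls `fault_tolerant_call` once and receives a result whether the primary or the buddy served it. If both fail, the `RpcError` from the buddy propagates to the caller.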