Group communication protocol for flexible distributed systems

  • Authors:
  • H. Higaki; M. Takizawa

  • Venue:
  • Proceedings of the 1996 International Conference on Network Protocols (ICNP '96)
  • Year:
  • 1996

Abstract

In large-scale distributed systems, processes have to be upgraded to accommodate changes in user requirements and system environments. Conventional upgrading methods cannot keep the system available because multiple processes have to be suspended simultaneously. This paper discusses a new method in which each process can invoke the upgrading procedure asynchronously. The key idea is that multiple versions of processes can operate concurrently for a temporary period. Each pair of an old-version process and its new-version counterpart is managed as one process group. The proposed group communication protocol supports message transmission among the process groups. Moreover, the protocol detects protocol errors caused by the coexistence of multiple versions of processes. A checkpoint-rollback algorithm for resolving these protocol errors is proposed. With this algorithm, only the minimum number of processes is rolled back, and the rollback is performed asynchronously. Hence, the system remains highly available even if a protocol error occurs.
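
The process-group idea in the abstract can be illustrated with a minimal sketch. It is not the authors' protocol: the abstract does not say how messages are routed or how protocol errors are detected, so the sketch assumes that each message carries the protocol version of its sender and that a protocol error is raised when no member of the receiving group speaks that version. All names (`Process`, `ProcessGroup`, `ProtocolError`, `deliver`, `retire`) are hypothetical, and the checkpoint-rollback step is not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class ProtocolError(Exception):
    """Assumed error: no group member understands the message's version."""


@dataclass
class Process:
    name: str
    version: int
    inbox: List[str] = field(default_factory=list)

    def receive(self, payload: str) -> None:
        # Record the message; a real process would act on it.
        self.inbox.append(payload)


@dataclass
class ProcessGroup:
    """An old-version process and its new-version counterpart, seen as one unit."""
    name: str
    members: Dict[int, Process] = field(default_factory=dict)

    def add(self, process: Process) -> None:
        # Register a member under the protocol version it speaks.
        self.members[process.version] = process

    def retire(self, version: int) -> None:
        # Remove a member, e.g. the old version once the upgrade is finished.
        self.members.pop(version, None)

    def deliver(self, version: int, payload: str) -> Process:
        # Assumption: route by the sender's version; a miss is a protocol error.
        target = self.members.get(version)
        if target is None:
            raise ProtocolError(
                f"group {self.name!r} has no member for version {version}"
            )
        target.receive(payload)
        return target


if __name__ == "__main__":
    group = ProcessGroup("server")
    group.add(Process("server_old", version=1))
    group.add(Process("server_new", version=2))

    # Both versions coexist, so senders of either version are served.
    print(group.deliver(1, "request from a not-yet-upgraded peer").name)
    print(group.deliver(2, "request from an upgraded peer").name)

    # After the old version is retired, a version-1 message is an error.
    group.retire(1)
    try:
        group.deliver(1, "late request from a not-yet-upgraded peer")
    except ProtocolError as err:
        print("protocol error:", err)
```

In this sketch the sender addresses the group rather than an individual process, which is what lets a peer be upgraded without its correspondents being suspended at the same moment. Under the paper's approach a detected error would trigger the checkpoint-rollback algorithm, whereas the sketch only raises an exception.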