The concept of “stability” in asynchronous distributed decision-making systems

  • Authors:
  • T. S. Lee; S. Ghosh

  • Affiliations:
  • Vitrin Technol., Sunnyvale, CA;-

  • Venue:
  • IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
  • Year:
  • 2000

Abstract

Asynchronous distributed decision-making (ADDM) systems constitute a special class of distributed problems and are characterized as large, complex systems wherein the principal elements are geographically dispersed entities that communicate among themselves, asynchronously, through message passing and are permitted autonomy in local decision making. Such systems generally offer significant advantages over traditional, centralized algorithms in the form of concurrency, scalability, high throughput, efficiency, low vulnerability to catastrophic failures, and robustness. A fundamental property of ADDM systems is stability, which refers to their behavior under representative perturbations to their operating environments, given that such systems are intended to be real, complex, and, to some extent, mission-critical, and are subject to unexpected changes in their operating conditions. This paper introduces the concept of stability in ADDM systems and proposes an intuitive yet practical and usable definition that is inspired by those used in control systems and physics. An ADDM system is defined as stable if, having been initiated in a steady state, it returns to a steady state in finite time following a perturbation. Equilibrium, or steady state, is defined by placing bounds on the measured error in the system. Where the final steady state is equivalent to the initial one, a system is referred to as strongly stable. If the final steady state is potentially worse than the initial one, a system is deemed marginally stable. When a system fails to return to steady state following the perturbation, it is unstable. The perturbations are classified as either changes in the input pattern or changes in one or more environmental characteristics of the system, such as hardware failures. For a given ADDM system, the definitions are based on performance indices that must be judiciously identified by the system architect and are likely to be unique.
To facilitate the understanding of stability in representative real-world systems, this paper reports the analysis of two basic manifestations of ADDM systems that have been reported in the literature: (1) a decentralized military command and control problem, MFAD, and (2) a novel distributed algorithm with soft reservation for efficient scheduling and congestion mitigation in railway networks, RYNSORD. Stability analysis of MFAD and RYNSORD yields key stable and unstable conditions. A system determined to be stable provides the reassurance that the system will perform well under adverse conditions. In contrast, a system deemed unstable reflects the need to address key weaknesses in the system design. Thus, stability analysis is a necessary and critical step in the development of any ADDM system.
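
The three-way classification described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract gives no formulas, so everything concrete here is an assumption made for illustration. In particular, the names `settled_level` and `classify`, the error band `band`, the observation length `window`, the use of a single scalar error trace as the performance index, and the reading of "finite time" as "within the observed trace" are all hypothetical. A lower final error is treated as no worse than the initial one.

```python
from enum import Enum
from typing import Optional, Sequence


class Stability(Enum):
    STRONGLY_STABLE = "strongly stable"
    MARGINALLY_STABLE = "marginally stable"
    UNSTABLE = "unstable"


def settled_level(errors: Sequence[float], band: float, window: int) -> Optional[float]:
    """Return the mean of the last `window` error samples if they stay
    within `band` of one another (assumed steady-state test), else None."""
    if window <= 0 or len(errors) < window:
        return None
    tail = errors[-window:]
    if max(tail) - min(tail) > band:
        return None
    return sum(tail) / window


def classify(pre: Sequence[float], post: Sequence[float],
             band: float, window: int) -> Stability:
    """Classify one perturbation experiment.

    `pre` is the error trace before the perturbation, `post` the trace after it.
    """
    initial = settled_level(pre, band, window)
    if initial is None:
        # The definition only applies to systems initiated in a steady state.
        raise ValueError("system was not in a steady state before the perturbation")
    final = settled_level(post, band, window)
    if final is None:
        # No steady state reached within the observed (finite) trace.
        return Stability.UNSTABLE
    if final <= initial + band:
        # Final steady state equivalent to (or, by assumption, better than) the initial one.
        return Stability.STRONGLY_STABLE
    # A steady state was reached, but with a larger error than before.
    return Stability.MARGINALLY_STABLE


# Hypothetical error traces for one performance index.
pre = [1.0, 1.1, 0.9, 1.0]
recovers = [5.0, 3.0, 1.2, 1.0, 1.1]    # settles near the initial level
degrades = [5.0, 4.0, 3.0, 3.1, 2.9]    # settles at a worse level
diverges = [2.0, 4.0, 8.0, 16.0, 32.0]  # never settles

results = [classify(pre, post, band=0.5, window=3)
           for post in (recovers, degrades, diverges)]
```

In a real ADDM system the architect would choose the performance indices and bounds per system, as the abstract notes, so `band` and `window` stand in for design decisions rather than fixed constants.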