Automated revision of distributed and real-time programs

  • Authors:
  • Sandeep S. Kulkarni;Borzoo Bonakdarpour

  • Affiliations:
  • Michigan State University;Michigan State University

  • Venue:
  • Automated revision of distributed and real-time programs
  • Year:
  • 2009

Abstract

This dissertation concentrates on the problem of automated revision of distributed and real-time programs that are correct-by-construction. In particular, our research addresses the following question: "if an existing program fails to satisfy a property, is it feasible to automatically revise the program inside its current state space and set of transitions, so that the revised program satisfies the failed property while it continues to satisfy its current properties?" We study this problem in two broad contexts: (1) revision in closed systems, where programs do not interact with the environment, and (2) revision in open systems, where programs are subject to a set of uncontrollable faults imposed by the environment. We refer to the former problem as "addition of a property to the input program" and the latter as "addition of fault-tolerance to the input program". We classify our results into three types: (1) polynomial-time sound and complete algorithms, (2) hardness results, and (3) sound efficient heuristics. Throughout this dissertation, we focus on three types of programs: (1) untimed centralized, (2) untimed distributed, and (3) centralized real-time. We omit distributed real-time programs because the structure of such programs is very complex and, hence, their formal analysis involves highly complex decision procedures. Thus, it is more beneficial to study the effects of distribution and time on programs separately in order to identify the stumbling blocks. Regarding addition of properties to programs in closed systems, we focus on UNITY safety and progress properties. We choose UNITY properties because they have been found highly expressive in specifying a large class of programs.
Regarding addition of fault-tolerance to existing fault-intolerant programs, we consider three levels of fault-tolerance, namely failsafe, nonmasking, and masking, based on satisfaction of safety and liveness properties in the presence of faults. In order to capture time-related behaviors of programs in the presence of faults, we consider two additional levels, namely soft and hard, based on satisfaction of timing constraints in the presence of faults. We address some of the implementation difficulties using BDD-based heuristics for revising programs in both closed and open systems with respect to safety and progress properties. Our experimental results on synthesis of a variety of distributed programs show performance improvements of several orders of magnitude in both time and space. We also introduce distributed and parallel techniques to improve the performance of our revision methods even further. Finally, we introduce our tool SYCRAFT, which is capable of adding fault-tolerance to moderate-sized fault-intolerant distributed programs. In summary, this dissertation concludes that automated revision of moderate-sized programs (reachable states of size 10^50 and beyond) is feasible in both theory and practice.
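
To make the idea of revising a program "inside its current state space and set of transitions" concrete, the following is a minimal explicit-state sketch for the simplest setting mentioned above: an untimed centralized program in a closed system, revised to satisfy a safety property of the form "never reach a bad state". It is not the dissertation's algorithm and does not reflect the BDD-based techniques or SYCRAFT. The function name `revise_for_safety`, the representation of a program as a set of `(source, target)` pairs, and the rule that a state which loses all of its outgoing transitions is treated as unsafe are all assumptions made for illustration.

```python
from typing import Hashable, Iterable, Optional, Set, Tuple

State = Hashable
Transition = Tuple[State, State]


def revise_for_safety(
    transitions: Iterable[Transition],
    bad_states: Iterable[State],
    initial_states: Iterable[State],
) -> Optional[Set[Transition]]:
    """Hypothetical sketch: keep only existing transitions so that no bad
    state is reachable and no state is turned into a new deadlock.

    Returns the revised transition set, or None if some initial state
    cannot be kept safe. No new states or transitions are ever added.
    """
    all_transitions = set(transitions)
    unsafe = set(bad_states)
    # States that could move in the original program; if revision removes
    # all of their moves, we treat them as unsafe too (assumption).
    had_successor = {src for src, _ in all_transitions}

    changed = True
    while changed:
        changed = False
        kept = {
            (src, dst)
            for (src, dst) in all_transitions
            if src not in unsafe and dst not in unsafe
        }
        still_live = {src for src, _ in kept}
        for state in had_successor - unsafe:
            if state not in still_live:
                unsafe.add(state)
                changed = True

    if any(state in unsafe for state in initial_states):
        return None
    return {
        (src, dst)
        for (src, dst) in all_transitions
        if src not in unsafe and dst not in unsafe
    }


# Usage: state "c" can only move to "bad", so it is pruned along with "bad".
program = {("a", "b"), ("b", "a"), ("b", "bad"), ("a", "c"), ("c", "bad")}
revised = revise_for_safety(program, bad_states={"bad"}, initial_states={"a"})
```

The sketch only removes transitions, which mirrors the constraint in the research question that the revised program stays within the original state space and transition set. Handling progress properties, distribution, timing constraints, or uncontrollable fault transitions would require substantially more machinery than shown here.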