System Recovery through Dynamic Regeneration of Workflow Specification

  • Authors: Casey K. Fung; Patrick C. K. Hung

  • Affiliations: Boeing Phantom Works, USA; University of Ontario Institute of Technology, Canada

  • Venue: ISORC '05 Proceedings of the Eighth IEEE International Symposium on Object-Oriented Real-Time Distributed Computing
  • Year: 2005

Abstract

Distributed software systems are the basis for innovative applications (e.g., pervasive computing, telecommunication services, and Grid utility services). The key to achieving survivable and maintainable distributed systems is agility; without it, the non-deterministic nature of distribution would leave the system uncontrollable. Survivability is defined as the capability of a service to fulfill its mission in a timely manner, even in the presence of attacks, failures, or accidents. Because of the severe consequences of failure, organizations are focusing on service survivability as a key risk management strategy for business processes. There are three key survivability properties: resistance, recognition, and recovery. Recovery, a hallmark of survivability, is the capability to maintain critical components and resources during an attack, limit the extent of damage, and restore full services following the attack. Exception handling is one way to deal with the recovery aspect of survivability. Business Process Execution Language for Web Services (BPEL) has been proposed for the formal specification of business processes and interaction protocols. BPEL defines an interoperable integration model that facilitates the expansion of automated process integration in both intra- and inter-corporate environments. A business process description requires the specification of both the normal flow and the possible variations due to exceptional situations that can be anticipated and monitored. This paper bridges the analysis of business process survivability and its recovery aspect in terms of exception handling in the context of BPEL. We propose an integrated approach to engineering a survivable distributed system through dynamic regeneration of workflow specifications when the system encounters attacks and failures.