Diagnosing architectural run-time failures

  • Authors:
  • Paulo Casanova; David Garlan; Bradley Schmerl; Rui Abreu

  • Affiliations:
  • CMU, USA; CMU, USA; CMU, USA; University of Porto, Portugal

  • Venue:
  • Proceedings of the 8th International Symposium on Software Engineering for Adaptive and Self-Managing Systems
  • Year:
  • 2013

Abstract

Self-diagnosis is a fundamental capability of self-adaptive systems. In order to recover from faults, a system needs to know which of its parts is responsible for the incorrect behavior. In previous work, we showed how to apply a design-time diagnosis technique at run time to identify faults at the architectural level of a system. Our contributions address three major shortcomings of that work: 1) we present an expressive, hierarchical language for describing system behavior, which can be used to diagnose when a system is behaving differently from expectations; the hierarchical language facilitates mapping low-level system events to architecture-level events; 2) we provide an automatic way to determine how much data to collect before an accurate diagnosis can be produced; and 3) we develop a technique that allows the detection of correlated faults between components. Our results are validated experimentally by injecting several failures into a system and accurately diagnosing them using our algorithm.