Runtime Verification
Reactive distributed systems have pervaded everyday life and objects, but often lack measures to ensure adequate behaviour in the presence of unforeseen events or even errors at runtime. As interactions and dependencies within distributed systems increase, it becomes harder to detect failures that depend on the exact situation and environmental conditions in which they occur. As a result, it is increasingly difficult not only to detect failures, but also to distinguish the symptoms of a fault from the actual fault itself, i.e., the cause of a problem.

In this paper, we present a novel and efficient approach for analysing reactive distributed systems at runtime: a framework for detecting failures as well as identifying their causes. Our approach is based on monitoring safety properties, specified in the linear-time temporal logic LTL (respectively, TLTL), from which monitor components that detect violations of these properties are generated automatically. Based on the results of the monitors, a dedicated diagnosis is then performed to identify explanations for the misbehaviour of the system. These may be used to store detailed log files or to trigger recovery measures. Our framework is modular and layered, and incurs only minimal communication overhead, especially when compared to other, similar approaches. Further, we sketch first experimental results from our implementation and describe how our techniques can be used to build a variety of distributed systems.
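
To make the idea of a monitor component for a safety property concrete, the following is a minimal, hand-written sketch and not the monitor synthesis described in the paper. It assumes a hypothetical property ("a resource is never acquired again before it has been released"), a hypothetical event alphabet (`acquire`, `release`, anything else), and a two-valued verdict: inconclusive (`?`) or violated (`false`). The names `SafetyMonitor`, `Verdict`, and `run` are illustrative only.

```python
from enum import Enum


class Verdict(Enum):
    INCONCLUSIVE = "?"    # no violation observed so far
    VIOLATED = "false"    # the observed prefix can no longer be extended to a good trace


class SafetyMonitor:
    """Hand-written monitor for the hypothetical property
    'no second acquire before a release'."""

    def __init__(self):
        self.held = False
        self.verdict = Verdict.INCONCLUSIVE

    def step(self, event):
        # Once a safety property is violated, the verdict never changes.
        if self.verdict is Verdict.VIOLATED:
            return self.verdict
        if event == "acquire":
            if self.held:
                self.verdict = Verdict.VIOLATED
            self.held = True
        elif event == "release":
            self.held = False
        # All other events are irrelevant to this property.
        return self.verdict


def run(trace):
    """Feed a finite trace to a fresh monitor and collect the verdict after each event."""
    monitor = SafetyMonitor()
    return [monitor.step(event).value for event in trace]


if __name__ == "__main__":
    print(run(["acquire", "work", "release", "acquire"]))
    print(run(["acquire", "acquire", "release"]))
```

In this sketch a violation is irrevocable, which reflects the general nature of safety properties: a bad prefix cannot be repaired by later events. A diagnosis layer such as the one the abstract describes would consume such verdicts from several monitors; that part is not modelled here.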