Monere: monitoring of service compositions for failure diagnosis

Authors:
Bruno Wassermann;Wolfgang Emmerich
Affiliations:
University College London, London, UK;University College London, London, UK
Venue:
ICSOC'11 Proceedings of the 9th international conference on Service-Oriented Computing
Year:
2011

Abstract

Service-oriented computing has enabled developers to build large, cross-domain service compositions in a more routine manner. These systems inhabit complex, multi-tier operating environments that pose many challenges to their reliable operation. Unanticipated failures at runtime can be time-consuming to diagnose and may propagate across administrative boundaries. It has been argued that measuring readily available data about system operation can significantly increase the failure management capabilities of such systems. We have built an online monitoring system for cross-domain Web service compositions called Monere, which we use in a controlled experiment involving human operators in order to determine the effects of such an approach on diagnosis times for system-level failures. This paper gives an overview of how Monere is able to instrument relevant components across all layers of a service composition and to exploit the structure of BPEL workflows to obtain structural cross-domain dependency graphs. Our experiments reveal a reduction in diagnosis time of more than 20%. However, further analysis reveals this benefit to be dependent on certain conditions, which leads to insights about promising directions for effective support of failure diagnosis in large Web service compositions.