In this paper, we design, implement, and evaluate AdaptGuard, a software service that guards adaptive systems, such as QoS-adaptive servers, against instability caused by software anomalies and faults. Adaptive systems are growing in importance because performance must be adjusted across a widening range of changing environmental conditions without human intervention. Such systems, however, implicitly assume a model of system behavior; when that model is violated, adaptation loops can perform poorly or fail. The purpose of AdaptGuard is simple: in the absence of an a priori model of the adaptive software system, it anticipates system instability, attributes it to the responsible "runaway" adaptation loop, and disconnects that loop, replacing it with conservative but stable open-loop control until further notice. We evaluate AdaptGuard by injecting various software faults into adaptive systems managed by typical adaptation loops. The results demonstrate that it successfully anticipates instability caused by the injected faults and recovers from the resulting performance degradation. We also present a case study using an Apache Web server that serves multiple classes of traffic, in which unexpected interactions between an admission controller and the Linux anti-livelock mechanism cause a performance anomaly. Although AdaptGuard has no model of this mechanism, it correctly attributes the problem to the runaway loop and fixes it.
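The guard-and-fall-back idea described above can be illustrated with a minimal Python sketch. It is not AdaptGuard's actual algorithm, which the abstract does not specify. The class `GuardedLoop`, its fields, and the divergence test (absolute error growing for several consecutive samples) are all assumptions made for illustration. The sketch shows only the general pattern: watch a feedback loop, and once it appears to be diverging, disconnect it and apply a fixed, conservative open-loop setting.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GuardedLoop:
    """Wraps one adaptation loop. All names here are hypothetical."""
    name: str
    controller: Callable[[float], float]  # maps error -> actuator setting
    safe_output: float                    # conservative open-loop setting
    window: int = 3                       # consecutive error increases tolerated
    closed: bool = True                   # True while feedback is connected
    _errors: List[float] = field(default_factory=list, init=False, repr=False)

    def step(self, setpoint: float, measurement: float) -> float:
        """Return the actuator setting for this sampling period."""
        error = setpoint - measurement
        self._errors.append(abs(error))
        # Assumed heuristic: treat a steadily growing error as a runaway loop.
        if self.closed and self._is_diverging():
            self.closed = False  # disconnect the feedback loop
        return self.controller(error) if self.closed else self.safe_output

    def _is_diverging(self) -> bool:
        """True if |error| strictly increased over the last `window` steps."""
        recent = self._errors[-(self.window + 1):]
        if len(recent) < self.window + 1:
            return False
        return all(later > earlier for earlier, later in zip(recent, recent[1:]))


def run(loop: GuardedLoop, setpoint: float, measurements: List[float]) -> List[float]:
    """Feed a sequence of measurements through the guarded loop."""
    return [loop.step(setpoint, m) for m in measurements]
```

In this sketch, each adaptation loop gets its own guard, so attribution is trivial: the guard that trips identifies the loop it wraps. Attributing instability among interacting loops, as the abstract describes, is a harder problem that this example does not attempt. The loop also stays open once tripped, mirroring the "until further notice" behavior.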