EntomoModel: Understanding and Avoiding Performance Anomaly Manifestations

Authors:
Christopher Stewart; Kai Shen; Arun Iyengar; Jian Yin
Venue:
MASCOTS '10 Proceedings of the 2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems
Year:
2010

Abstract

Subtle implementation errors or misconfigurations in complex Internet services may lead to performance degradations without causing failures. These undiscovered performance anomalies afflict many of today’s systems, causing violations of service-level agreements (SLAs), unnecessary resource over-provisioning, or both. In this paper, we re-inserted realistic anomaly causes into a multi-tier Internet service architecture and studied their manifestations. We observed that each cause had certain workload and management parameters that were more likely to trigger manifestations, hinting that such parameters could be effective classifiers. This observation held even when anomaly causes manifested differently in combination than in isolation. Our study motivates EntomoModel, a framework for depicting performance anomaly manifestations. EntomoModel uses decision tree classification and a design-driven performance model to characterize the workload and management policy settings under which manifestations are likely. EntomoModel enables online system management that avoids anomaly manifestations by dynamically adjusting system management parameters. Our trace-driven evaluations show that manifestation avoidance based on EntomoModel, or entomophobic management, can reduce 98th percentile SLA violations by 67% compared to an anomaly-oblivious adaptive approach. In a cloud computing scenario with elastic resource allocation, our approach uses less than half of the resources needed by static over-provisioning.
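
The idea of classifying manifestations by workload and management parameters, then steering management settings away from risky regions, can be illustrated with a small sketch. This is not the authors' implementation: it covers only a decision tree and omits the design-driven performance model entirely. The two parameters (request rate and thread pool size), the synthetic observations, and the rule that generates the labels are all assumptions made up for illustration, as are the helper names `build_tree`, `predict`, and `choose_setting`.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of boolean labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            # Small tolerance so float noise is not mistaken for a gain.
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a tree; leaves are booleans, inner nodes are 4-tuples."""
    majority = Counter(labels).most_common(1)[0][0]
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )


def predict(tree, row):
    """Return True if the tree predicts an anomaly manifestation for row."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def choose_setting(tree, request_rate, candidates):
    """Largest candidate thread pool predicted not to manifest, else None."""
    safe = [c for c in candidates if not predict(tree, (request_rate, c))]
    return max(safe) if safe else None


# Synthetic observations: (request_rate, thread_pool_size).
# ASSUMPTION: the hypothetical anomaly manifests only under high load
# combined with a large thread pool; this rule is invented for the example.
RATES = [100, 200, 300, 400, 500]
POOLS = [8, 16, 32, 64, 128]
ROWS = [(rate, pool) for rate in RATES for pool in POOLS]
LABELS = [rate >= 400 and pool >= 64 for rate, pool in ROWS]

TREE = build_tree(ROWS, LABELS)

if __name__ == "__main__":
    for rate in (200, 450):
        print(rate, choose_setting(TREE, rate, POOLS))
```

Under these assumptions, the learned tree separates the risky region (high request rate together with a large thread pool) from the rest, and `choose_setting` plays the role of an avoidance policy: for a given workload it selects a management setting the classifier does not flag. A real deployment would train on measured manifestations rather than a hand-written rule.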