Automated analysis of performance and energy consumption for cloud applications
Proceedings of the 5th ACM/SPEC international conference on Performance engineering
When running mission-critical web-facing applications (e.g., electronic commerce) in cloud environments, predictable response time, typically specified in service level agreements (SLAs), is a major performance reliability requirement. Through extensive measurements of n-tier application benchmarks in a cloud environment, we study three factors that significantly affect application response time predictability: bursty workloads (typical of web-facing applications), soft resource management strategies (e.g., a global or a local thread pool), and bursts in system software consumption of hardware resources (e.g., Java Virtual Machine garbage collection). Using a set of profit-based performance criteria derived from typical SLAs, we show that response time reliability is brittle, with large response time variations (on the order of several seconds) depending on each of these factors. For example, for the same workload and hardware platform, different, apparently reasonable soft resource management strategies may result in profit differences of 26%. Similarly, modest increases in workload burstiness may result in profit drops of more than 50%. Our study shows that the performance reliability of large-scale distributed applications is a significant and interesting research challenge. Furthermore, our results show that profit-based performance criteria may contribute significantly to delimiting the boundaries of performance unreliability and thus support effective management of clouds.