Automatic detection of performance deviations in the load testing of large scale systems

  • Authors:
  • Haroon Malik; Hadi Hemmati; Ahmed E. Hassan

  • Affiliations:
  • Queen's University, Canada; University of Waterloo, Canada; Queen's University, Canada

  • Venue:
  • Proceedings of the 2013 International Conference on Software Engineering (ICSE 2013)
  • Year:
  • 2013

Abstract

Load testing is one of the means for evaluating the performance of Large Scale Systems (LSS). At the end of a load test, performance analysts must analyze thousands of performance counters from hundreds of machines under test. These performance counters are measures of run-time system properties such as CPU utilization, Disk I/O, memory consumption, and network traffic. Analysts observe counters to determine whether the system is meeting its Service Level Agreements (SLAs). In this paper, we present and evaluate one supervised and three unsupervised approaches to help performance analysts 1) more effectively compare load tests in order to detect performance deviations that may lead to SLA violations, and 2) provide them with a smaller, manageable set of important performance counters to assist in root-cause analysis of the detected deviations. Our case study is based on load test data obtained from both a large scale industrial system and an open source benchmark application. The case study shows that our wrapper-based supervised approach, which uses a search-based technique to find the best subset of performance counters and a logistic regression model for deviation prediction, can reduce the set of performance counters by up to 89% while detecting performance deviations with few false positives (i.e., 95% average precision). The study also shows that the supervised approach is more stable and effective than the unsupervised approaches, but it incurs more overhead due to its semi-automated training phase.
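
The abstract does not describe the implementation, so the sketch below is an illustration only, not the authors' code: one plausible shape of a wrapper-based counter-subset search (greedy forward selection) scored with a logistic regression deviation predictor. It assumes load-test intervals have already been labeled as deviating or normal, and it uses scikit-learn; the function name wrapper_select and all parameters are hypothetical.

```python
# Minimal sketch of a wrapper-based search over performance counters,
# assuming X holds per-interval counter samples (n_samples, n_counters)
# and y labels each interval as deviation (1) or normal (0).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def wrapper_select(X, y, max_counters=10, cv=5):
    """Greedily add the counter that most improves the cross-validated
    precision of a logistic regression deviation predictor."""
    remaining = list(range(X.shape[1]))
    selected, best_score = [], 0.0
    while remaining and len(selected) < max_counters:
        # Score every candidate counter when added to the current subset.
        scores = []
        for j in remaining:
            cols = selected + [j]
            model = LogisticRegression(max_iter=1000)
            s = cross_val_score(model, X[:, cols], y, cv=cv,
                                scoring="precision").mean()
            scores.append((s, j))
        s, j = max(scores)
        if s <= best_score:  # no remaining counter improves the model; stop
            break
        best_score = s
        selected.append(j)
        remaining.remove(j)
    # The reduced subset is what an analyst would inspect for root causes.
    return selected, best_score
```

In this reading, the "up to 89% reduction" corresponds to the search terminating with a small `selected` subset out of the thousands of collected counters, while precision-based scoring reflects the paper's emphasis on few false positives.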