Automated and Adaptive Threshold Setting: Enabling Technology for Autonomy and Self-Management

  • Authors:
  • David Breitgand; Ealan Henis; Onn Shehory

  • Venue:
  • ICAC '05 Proceedings of the Second International Conference on Autonomic Computing
  • Year:
  • 2005

Abstract

Threshold violations reported for system components signal undesirable conditions in the system. In complex computer systems, characterized by dynamically changing workload patterns and evolving business goals, precomputed thresholds on the operational values of individual components' performance metrics are not available. This paper focuses on a fundamental enabling technology for performance management: automatic computation and adaptation of statistically meaningful performance thresholds for system components. We formally define the problem of adaptive threshold setting with controllable threshold accuracy and propose a novel algorithm for solving it. Given a set of Service Level Objectives (SLOs) for the applications executing in the system, our algorithm continually adapts the per-component performance thresholds to the observed SLO violations. The purpose of this continual adaptation is to control the average numbers of false positive and false negative alarms and thereby improve the efficacy of threshold-based management. We implemented the proposed algorithm and applied it to a relatively simple, albeit non-trivial, storage system. In our experiments we achieved a positive predictive value of 92% and a negative predictive value of 93% for component-level performance thresholds.
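
The two figures reported in the abstract are standard measures of alarm quality. The sketch below is only an illustration of how they could be computed for one component threshold; it is not the paper's algorithm. It assumes that each observation pairs a component metric value with a flag saying whether an SLO was violated at that time, and that an alarm is raised when the metric exceeds the threshold. The function name, the sample data, and the threshold value are all hypothetical.

```python
import math


def predictive_values(samples, threshold):
    """Return (PPV, NPV) for a single component threshold.

    samples   -- iterable of (metric_value, slo_violated) pairs (assumed format)
    threshold -- an alarm is raised when metric_value > threshold (assumed rule)
    """
    tp = fp = tn = fn = 0
    for value, slo_violated in samples:
        alarm = value > threshold
        if alarm and slo_violated:
            tp += 1  # alarm raised, SLO really violated
        elif alarm:
            fp += 1  # false positive: alarm without an SLO violation
        elif slo_violated:
            fn += 1  # false negative: SLO violation without an alarm
        else:
            tn += 1  # no alarm, no violation

    # PPV = TP / (TP + FP): fraction of alarms that were justified.
    ppv = tp / (tp + fp) if (tp + fp) else math.nan
    # NPV = TN / (TN + FN): fraction of quiet periods that were truly fine.
    npv = tn / (tn + fn) if (tn + fn) else math.nan
    return ppv, npv


# Hypothetical observations: (response time in ms, SLO violated?)
observations = [
    (10.0, False), (20.0, False), (35.0, True),
    (40.0, True), (25.0, True), (32.0, False),
]
ppv, npv = predictive_values(observations, threshold=30.0)
print(f"PPV={ppv:.2f} NPV={npv:.2f}")
```

An adaptive scheme in the spirit of the abstract would re-evaluate such measures as new SLO violations are observed and move the threshold accordingly; how the paper does this is not described in the abstract and is not reproduced here.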