Regression-based utilization prediction algorithms: an empirical investigation

  • Authors:
  • I. J. Davis; H. Hemmati; R. C. Holt; M. W. Godfrey; D. M. Neuse; S. Mankovskii

  • Affiliations:
  • University of Waterloo, Waterloo, Ontario, Canada; University of Manitoba, Winnipeg, Manitoba, Canada; University of Waterloo, Waterloo, Ontario, Canada; University of Waterloo, Waterloo, Ontario, Canada; CA Technologies; CA Technologies

  • Venue:
  • CASCON '13 Proceedings of the 2013 Conference of the Center for Advanced Studies on Collaborative Research
  • Year:
  • 2013

Abstract

Predicting future behavior reliably and efficiently is vital for systems that manage virtual services. Such systems must be able to balance loads within a cloud environment to ensure that service level agreements (SLAs) are met at a reasonable expense. These virtual services, while often comparatively idle, are occasionally heavily utilized. Standard approaches to modeling system behavior that analyze the totality of the observed data, such as regression-based approaches, tend to predict average rather than exceptional system behavior and may ignore important patterns of change over time. Consequently, such approaches are of limited use in providing warnings of future peak utilization within a cloud environment. Skewing predictions to better fit peak utilizations results in a poor fit to low utilizations, which in turn compromises the ability to accurately predict peak utilizations because of false positives. In this paper, we present an adaptive approach that estimates, at run time, the best prediction value based on the performance of previously seen predictions. This algorithm has wide applicability. We applied this adaptive technique to two large-scale real-world case studies. In both studies, the results show that the adaptive approach predicts low, medium, and high utilizations more accurately than the other proposed approaches, at low cost, by adapting to changing patterns within the input time series. This facilitates better proactive management and placement of systems running within a cloud.
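
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: at each step, pick the prediction from whichever candidate predictor has performed best on recently seen values. The candidate predictors, the class name AdaptiveSelector, and the window sizes are all hypothetical choices made for illustration; they are not taken from the paper and should not be read as the authors' method.

```python
from collections import deque
from statistics import mean


# Hypothetical candidate predictors; each maps the observed history
# (a non-empty list of utilization values) to a one-step-ahead guess.
def last_value(history):
    return history[-1]


def window_mean(history, k=4):
    return mean(history[-k:])


def window_max(history, k=4):
    return max(history[-k:])


def linear_trend(history, k=4):
    """Fit a least-squares line to the last k points and extrapolate one step."""
    ys = history[-k:]
    n = len(ys)
    if n < 2:
        return ys[-1]
    x_bar = (n - 1) / 2
    y_bar = mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(ys))
    return y_bar + (sxy / sxx) * (n - x_bar)


class AdaptiveSelector:
    """Choose, at run time, the candidate with the lowest recent absolute error."""

    def __init__(self, predictors, error_window=5):
        self.predictors = predictors  # name -> callable(history)
        self.errors = {name: deque(maxlen=error_window) for name in predictors}
        self.pending = {}  # each candidate's guess for the next, unseen value
        self.history = []

    def observe(self, value):
        # Score the guesses made before this value arrived.
        for name, guess in self.pending.items():
            self.errors[name].append(abs(guess - value))
        self.history.append(value)
        # Let every candidate guess the next value.
        self.pending = {n: f(self.history) for n, f in self.predictors.items()}

    def predict(self):
        if not self.pending:
            raise ValueError("observe at least one value before predicting")

        def recent_error(name):
            errs = self.errors[name]
            return mean(errs) if errs else float("inf")

        best = min(self.pending, key=recent_error)
        return best, self.pending[best]


if __name__ == "__main__":
    selector = AdaptiveSelector({
        "last_value": last_value,
        "window_mean": window_mean,
        "window_max": window_max,
        "linear_trend": linear_trend,
    })
    # Made-up utilization samples (percent), mostly idle with one burst.
    for sample in [5, 6, 5, 7, 40, 85, 90, 88, 12, 6]:
        selector.observe(sample)
        print(sample, selector.predict())
```

Because the selector scores candidates only on a short window of recent errors, it can switch from an averaging predictor during idle periods to a trend or maximum predictor during bursts, which is the kind of behavior the abstract attributes to adapting to changing patterns in the input time series.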