Host load prediction using linear models

  • Authors:
  • Peter A. Dinda;David R. O'Hallaron

  • Affiliations:
  • Department of Computer Science, Northwestern University, 1890 Maple Avenue, Evanston, IL 60201, USA;Computer Science Department, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA

  • Venue:
  • Cluster Computing
  • Year:
  • 2000

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper evaluates linear models for predicting the Digital Unix five-second host load average from 1 to 30 seconds into the future. A detailed statistical study of a large number of long, fine grain load traces from a variety of real machines leads to consideration of the Box–Jenkins models (AR, MA, ARMA, ARIMA), and the ARFIMA models (due to self-similarity.) We also consider a simple windowed-mean model. The computational requirements of these models span a wide range, making some more practical than others for incorporation into an online prediction system. We rigorously evaluate the predictive power of the models by running a large number of randomized testcases on the load traces and then data-mining their results. The main conclusions are that load is consistently predictable to a very useful degree, and that the simple, practical models such as AR are sufficient for host load prediction. We recommend AR(16) models or better for host load prediction. We implement an online host load prediction system around the AR(16) model and evaluate its overhead, finding that it uses miniscule amounts of CPU time and network bandwidth.