Host load prediction using linear models

Authors:
Peter A. Dinda; David R. O'Hallaron
Affiliations:
Department of Computer Science, Northwestern University, 1890 Maple Avenue, Evanston, IL 60201, USA; Computer Science Department, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
Venue:
Cluster Computing
Year:
2000

Abstract

This paper evaluates linear models for predicting the Digital Unix five-second host load average from 1 to 30 seconds into the future. A detailed statistical study of a large number of long, fine-grained load traces from a variety of real machines leads us to consider the Box–Jenkins models (AR, MA, ARMA, ARIMA) and, because the traces exhibit self-similarity, the ARFIMA models. We also consider a simple windowed-mean model. The computational requirements of these models span a wide range, making some more practical than others for incorporation into an online prediction system. We rigorously evaluate the predictive power of the models by running a large number of randomized test cases on the load traces and then data-mining the results. The main conclusions are that load is consistently predictable to a very useful degree, and that simple, practical models such as AR are sufficient for host load prediction. We recommend AR(16) models or better for host load prediction. We implement an online host load prediction system around the AR(16) model and evaluate its overhead, finding that it uses minuscule amounts of CPU time and network bandwidth.
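
To make the two simplest model families named in the abstract concrete, here is a minimal Python sketch of an AR(p) predictor and a windowed-mean predictor. It is not the authors' implementation. The function names (`fit_ar`, `predict_ar`, `predict_windowed_mean`), the choice of Yule-Walker fitting via the Levinson-Durbin recursion, the synthetic trace, the window length, and the assumption of one load sample per second are all illustrative assumptions that do not come from the abstract.

```python
import random


def autocovariance(x, max_lag):
    """Biased sample autocovariance of x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[t] * c[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def fit_ar(x, p):
    """Fit an AR(p) model (assumed method: Yule-Walker + Levinson-Durbin).

    Returns (mean, phi), where phi[j] multiplies the sample j+1 steps back.
    """
    r = autocovariance(x, p)
    phi, err = [], r[0]
    for k in range(1, p + 1):
        if err <= 0.0:  # constant series: no structure left to model
            phi = phi + [0.0] * (p - len(phi))
            break
        acc = r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        refl = acc / err
        phi = [phi[j] - refl * phi[k - 2 - j] for j in range(k - 1)] + [refl]
        err *= 1.0 - refl * refl
    return sum(x) / len(x), phi


def predict_ar(history, mean, phi, steps):
    """Iterated multi-step prediction; needs len(history) >= len(phi)."""
    p = len(phi)
    recent = [v - mean for v in history[-p:]]
    out = []
    for _ in range(steps):
        nxt = sum(phi[j] * recent[-1 - j] for j in range(p))
        out.append(nxt + mean)
        recent.append(nxt)  # feed the prediction back in for the next step
    return out


def predict_windowed_mean(history, window, steps):
    """Predict every future step as the mean of the last `window` samples."""
    tail = history[-window:]
    return [sum(tail) / len(tail)] * steps


def synthetic_load(n, phi=0.8, seed=0):
    """Hypothetical stand-in for a load trace: an AR(1) process around 1.0."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 0.1)
        x.append(1.0 + prev)
    return x


if __name__ == "__main__":
    trace = synthetic_load(5000)
    mean, phi = fit_ar(trace[:4000], 16)           # AR(16), as recommended
    ar_pred = predict_ar(trace[:4000], mean, phi, 30)  # 30 steps = 30 s at 1 Hz (assumed)
    wm_pred = predict_windowed_mean(trace[:4000], 16, 30)
    print("AR(16) 1-step and 30-step predictions:", ar_pred[0], ar_pred[-1])
    print("windowed-mean prediction:", wm_pred[0])
```

Iterating the one-step predictor, as `predict_ar` does, is one common way to obtain predictions at longer lead times. For a stable fitted model the predictions decay toward the series mean as the lead time grows, which is why short horizons such as the 1 to 30 seconds studied here are where an AR model has the most to offer over a plain mean.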