Comparing Logs and Models of Parallel Workloads Using the Co-plot Method

  • Authors:
  • David Talby; Dror G. Feitelson; Adi Raveh

  • Venue:
  • IPPS/SPDP '99/JSSPP '99 Proceedings of the Job Scheduling Strategies for Parallel Processing
  • Year:
  • 1999

Abstract

We present a multivariate analysis technique called Co-plot that is especially suitable for samples with many variables and relatively few observations, as is often the case with workload data. Observations and variables are analyzed simultaneously. We find three stable clusters of highly correlated variables, whereas the workloads themselves are rather different from one another. Synthetic models for workload generation are also analyzed and found to be reasonable; however, each model usually covers only one machine type well. This leads us to conclude that a parameterized model of parallel workloads should be built, and we describe guidelines for such a model. Another feature that the models lack is self-similarity: we demonstrate that production logs exhibit this phenomenon in several attributes of the workload, and that, in contrast, none of the synthetic models do.
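
The abstract does not say how self-similarity was measured. As an illustration only, the sketch below uses the aggregated-variance method, one common way to estimate the Hurst parameter H of a series such as per-job runtimes or inter-arrival times. The function name, the block sizes, and the synthetic input are assumptions made for this example and are not taken from the paper. The variance of block means of a self-similar series scales roughly as m^(2H-2) with block size m, so H can be read off the slope of a log-log fit. Values near 0.5 suggest no long-range dependence, and values approaching 1 suggest strong self-similarity.

```python
import math
import random
from statistics import fmean, pvariance


def aggregated_variance_hurst(series, block_sizes):
    """Estimate the Hurst parameter H with the aggregated-variance method.

    Illustrative sketch; not necessarily the estimator used in the paper.
    """
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        if n_blocks < 2:
            continue  # too few blocks to compute a variance
        # Mean of each non-overlapping block of length m.
        means = [fmean(series[i * m:(i + 1) * m]) for i in range(n_blocks)]
        var = pvariance(means)
        if var > 0:
            log_m.append(math.log(m))
            log_var.append(math.log(var))
    if len(log_m) < 2:
        raise ValueError("need at least two usable block sizes")
    # Least-squares slope of log(variance) against log(block size).
    mx, my = fmean(log_m), fmean(log_var)
    slope = (sum((x - mx) * (y - my) for x, y in zip(log_m, log_var))
             / sum((x - mx) ** 2 for x in log_m))
    # Var(block mean) ~ m^(2H - 2), hence H = 1 + slope / 2.
    return 1 + slope / 2


# Hypothetical input: independent noise standing in for a workload attribute.
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(20000)]
sizes = [1, 2, 4, 8, 16, 32, 64, 128, 256]
h_noise = aggregated_variance_hurst(noise, sizes)
```

Applied to an attribute extracted from a real production log, the same call would give a rough indication of whether that attribute shows the long-range dependence the abstract describes. The estimate is sensitive to the choice of block sizes and to trends in the data, so it is best treated as a first check.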