Job status prediction - catch them before they fail

  • Authors:
  • Igor Grudenic;Nikola Bogunovic

  • Affiliations:
  • Faculty of Electrical Engineering and Computing, University of Zagreb, Unska, Zagreb, Croatia;Faculty of Electrical Engineering and Computing, University of Zagreb, Unska, Zagreb, Croatia

  • Venue:
  • GPC'11 Proceedings of the 6th international conference on Advances in grid and pervasive computing
  • Year:
  • 2011

Abstract

Jobs in a computer cluster can finish with several exit statuses, which are determined by application properties and by user and scheduler behavior. In this paper we analyze the importance of job statuses and the potential use of predicting them prior to job execution. A method for predicting failed jobs based on a Bayesian classifier is proposed, and its accuracy is analyzed on several workloads. The method is integrated into the EASY algorithm, which is adapted to prioritize jobs that are likely to fail. System performance is analyzed for both the failed jobs and the entire workload.
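
The abstract does not say which job attributes the classifier uses or how the EASY queue is reordered, so the following is only a minimal sketch of the general idea under stated assumptions. It uses a categorical naive Bayes classifier with Laplace smoothing, which is one common form of Bayesian classifier and may differ from the authors' model. The features (user name and a requested-size bucket), the status labels, and the names NaiveBayesJobClassifier and prioritize are hypothetical. The reordering step only sorts a waiting queue by predicted failure probability and does not implement EASY backfilling itself.

```python
from collections import Counter, defaultdict


class NaiveBayesJobClassifier:
    """Categorical naive Bayes with Laplace smoothing (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # smoothing constant (assumption)
        self.class_counts = Counter()               # status -> number of jobs
        self.feature_counts = defaultdict(Counter)  # (feature index, status) -> value counts
        self.feature_values = defaultdict(set)      # feature index -> distinct values seen

    def fit(self, jobs, statuses):
        """Count feature values per exit status over historical jobs."""
        for features, status in zip(jobs, statuses):
            self.class_counts[status] += 1
            for i, value in enumerate(features):
                self.feature_counts[(i, status)][value] += 1
                self.feature_values[i].add(value)
        return self

    def predict_proba(self, features):
        """Return a dict mapping each known status to its posterior probability."""
        total = sum(self.class_counts.values())
        scores = {}
        for status, count in self.class_counts.items():
            score = count / total  # prior P(status)
            for i, value in enumerate(features):
                n_values = len(self.feature_values[i]) or 1
                seen = self.feature_counts[(i, status)][value]
                # Smoothed likelihood P(value | status)
                score *= (seen + self.alpha) / (count + self.alpha * n_values)
            scores[status] = score
        norm = sum(scores.values())
        return {status: score / norm for status, score in scores.items()}


def prioritize(queue, model, failed_label="failed"):
    """Order waiting jobs so that likely failures come first.

    sorted() is stable, so jobs with equal scores keep their arrival order.
    """
    return sorted(
        queue,
        key=lambda job: model.predict_proba(job).get(failed_label, 0.0),
        reverse=True,
    )


# Hypothetical history: (user, requested-size bucket) and the observed exit status.
history = [
    ("alice", "small"), ("alice", "large"), ("bob", "small"),
    ("bob", "small"), ("alice", "large"), ("bob", "large"),
]
statuses = ["completed", "failed", "completed", "completed", "failed", "completed"]

model = NaiveBayesJobClassifier().fit(history, statuses)
waiting = [("bob", "small"), ("alice", "large")]
ordered = prioritize(waiting, model)
```

In this toy history, large jobs submitted by alice are the only ones that failed, so such a job is moved to the front of the waiting queue. Running likely failures early is one way to read the abstract's "prioritize jobs that are likely to fail"; the actual policy and its effect on system performance are described in the paper, not here.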