Parallel Performance Modeling using a Genetic Programming-based Error Correction Procedure

  • Authors:
  • Kavitha Raghavachar;G. Mahinthakumar;Patrick Worley;Emily Zechman;Ranji Ranjithan

  • Affiliations:
  • North Carolina State University;North Carolina State University;Oak Ridge National Laboratory;Texas A&M University;North Carolina State University

  • Venue:
  • Simulation
  • Year:
  • 2007

Abstract

Performance models of high performance computing (HPC) applications are important for several reasons. First, they give designers of HPC systems insight into the role of subsystems such as the processor or the network in determining application performance. Second, they allow HPC centers to target procurements more accurately to resource requirements. Third, they can be used to identify application performance bottlenecks and to provide insights about scalability issues. The suitability of a performance model for a particular performance investigation, however, is a function of both the accuracy and the cost of the model. A semi-empirical model previously published by the authors for an astrophysics application was shown to be inaccurate when predicting communication cost for large numbers of processors. It is hypothesized that this deficiency is due to the model's inability to capture communication contention (threshold effects) adequately, as well as other unmodeled components such as noise and I/O contention. In this paper we present a new approach to capture these unknown features and improve the predictive capabilities of the model. This approach is a systematic model error-correction procedure that uses evolutionary algorithms to find an error correction term to augment the existing model. Four variations of this procedure were investigated, and all were shown to produce better results than the original model. Successful cross-platform application of this approach showed that it adequately captures machine-dependent characteristics. This approach was then successfully demonstrated for a second application, further showing its versatility.
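
The idea of augmenting an existing model with an evolved error correction term can be illustrated with a minimal sketch. This is not the authors' procedure: it swaps genetic programming over expression trees for a simple elitist evolutionary search over two coefficients, and everything in it is an assumption made for illustration, including the additive form of the correction, the hypothetical base_model, the fixed correction shape a * p**b, and the synthetic measurements.

```python
import math
import random

def base_model(p):
    # Hypothetical existing model of communication cost on p processors.
    return 1.0 + 0.5 * math.log2(p)

def correction(params, p):
    # Assumed fixed correction form a * p**b, standing in for an evolved expression.
    a, b = params
    return a * p ** b

def rmse(params, data):
    # Root-mean-square error of (base model + correction) against measurements.
    se = sum((base_model(p) + correction(params, p) - t) ** 2 for p, t in data)
    return math.sqrt(se / len(data))

def evolve(data, generations=200, pop_size=20, seed=0):
    rng = random.Random(seed)
    # Seed the population with the zero correction (a = 0), so the elitist
    # selection below can never return something worse than the base model.
    pop = [(0.0, 1.0)]
    pop += [(rng.uniform(-1, 1), rng.uniform(0, 2)) for _ in range(pop_size - 1)]
    for _ in range(generations):
        # Mutate every parent with small Gaussian noise.
        children = [(a + rng.gauss(0, 0.05), b + rng.gauss(0, 0.05)) for a, b in pop]
        # Keep the best pop_size individuals among parents and children.
        pop = sorted(pop + children, key=lambda q: rmse(q, data))[:pop_size]
    return pop[0]

# Synthetic "measurements" (not from the paper): the base model plus an
# unmodeled term that grows linearly with the processor count.
data = [(p, base_model(p) + 0.01 * p) for p in (16, 32, 64, 128, 256, 512)]

best = evolve(data)
base_err = rmse((0.0, 1.0), data)  # error of the uncorrected model
best_err = rmse(best, data)        # error after adding the evolved correction
print(f"uncorrected RMSE: {base_err:.4f}, corrected RMSE: {best_err:.4f}")
```

Because the zero correction is part of the initial population and selection is elitist, the corrected error in this sketch is never larger than the uncorrected one. A genetic programming variant would evolve the functional form of the correction as well, not just its coefficients.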