Predicting the performance of parallel programs

  • Authors:
  • V. Blanco; J. A. González; C. León; C. Rodríguez; G. Rodríguez; M. Printista

  • Affiliations:
  • Dpto. Estadística, I.O. y Computación, Universidad de La Laguna, 38271 Tenerife, La Laguna, Spain; Dpto. Estadística, I.O. y Computación, Universidad de La Laguna, 38271 Tenerife, La Laguna, Spain; Dpto. Estadística, I.O. y Computación, Universidad de La Laguna, 38271 Tenerife, La Laguna, Spain; Dpto. Estadística, I.O. y Computación, Universidad de La Laguna, 38271 Tenerife, La Laguna, Spain; Dpto. Estadística, I.O. y Computación, Universidad de La Laguna, 38271 Tenerife, La Laguna, Spain; Universidad Nacional de San Luis, Ejército de los Andes 950, San Luis, Argentina

  • Venue:
  • Parallel Computing
  • Year:
  • 2004

Abstract

This work presents a new approach to relating theoretical complexity models to performance analysis and tuning. The analysis of an algorithm produces a complexity function that approximates the asymptotic number of operations the algorithm performs. The time spent on these operations depends on the software-hardware platform being used. From the performance point of view, such platforms are usually described through a number of parameters, whose values are measured by a benchmarking program. When a platform is available, the algorithmic constants associated with the complexity formula can be computed using multidimensional linear regression, but the problem of predicting performance when the platform is not available remains. We introduce the concept of Universal Instruction Class and derive from it a set of equations relating the values of the algorithmic constants to the platform parameters. Because current memory systems are hierarchical, the performance behavior of most algorithms changes across a small number of large regions corresponding to small, medium and large inputs, and the constants in the complexity formula usually take different values in each region. Assuming a complexity formula for the memory resources is available, it is possible to find a partition of the input size space and the values of the algorithmic constants for each part. In this way, although the complexity formula stays the same, the family of constants adapts it to the different stationary uses of the memory.
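
As a rough illustration of the regression step mentioned in the abstract, the sketch below fits the constants of a complexity formula to measured run times by least squares, and then fits a separate set of constants for each input-size region. It is not the authors' implementation. The formula T(n) = A·n·log2(n) + B·n + C, the names `fit_constants`, `fit_by_region` and `NLOGN_BASIS`, and the region boundaries passed in by hand are all assumptions made for the example; the paper derives the partition from a complexity formula for the memory resources rather than taking it as input.

```python
import math

# Hypothetical complexity formula T(n) = A*n*log2(n) + B*n + C,
# written as a list of basis terms (an assumption, not from the paper).
NLOGN_BASIS = [lambda n: n * math.log2(n), lambda n: float(n), lambda n: 1.0]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("basis terms are not independent on these sizes")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_constants(sizes, times, basis):
    """Least-squares fit of times[i] ~ sum_k constants[k] * basis[k](sizes[i])."""
    rows = [[f(n) for f in basis] for n in sizes]
    k = len(basis)
    # Scale each column to unit maximum to keep the normal equations well conditioned.
    scale = [max(abs(r[j]) for r in rows) or 1.0 for j in range(k)]
    rows = [[r[j] / scale[j] for j in range(k)] for r in rows]
    # Normal equations: (X^T X) c = X^T t
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, times)) for i in range(k)]
    scaled = solve_linear_system(xtx, xty)
    # Undo the column scaling to recover the constants of the original formula.
    return [c / s for c, s in zip(scaled, scale)]


def fit_by_region(sizes, times, basis, boundaries):
    """Fit one set of constants per region; boundaries are ascending size thresholds."""
    regions = [[] for _ in range(len(boundaries) + 1)]
    for n, t in zip(sizes, times):
        index = sum(n >= b for b in boundaries)  # number of thresholds reached
        regions[index].append((n, t))
    return [
        fit_constants([n for n, _ in reg], [t for _, t in reg], basis)
        for reg in regions
    ]


def predict(n, constants_per_region, basis, boundaries):
    """Evaluate the formula with the constants of the region that contains n."""
    constants = constants_per_region[sum(n >= b for b in boundaries)]
    return sum(c * f(n) for c, f in zip(constants, basis))
```

In use, `sizes` and `times` would come from timing runs on the available platform, for example `fit_by_region(sizes, times, NLOGN_BASIS, boundaries=[256])` for a two-region model. Each region needs at least as many measurements as there are basis terms. A library routine such as a QR-based least-squares solver would be a more robust choice than the hand-written normal equations used here to keep the sketch dependency-free.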