Applying neural networks to performance estimation of embedded software

  • Authors:
  • Marcio Seiji Oyamada; Felipe Zschornack; Flávio Rech Wagner

  • Affiliations:
  • Universidade Federal do Rio Grande do Sul, Instituto de Informática, Porto Alegre, RS, Brazil and Universidade Estadual do Oeste do Paraná, Colegiado de Informática, Cascavel, PR, B ...; Universidade Federal do Rio Grande do Sul, Instituto de Informática, Porto Alegre, RS, Brazil; Universidade Federal do Rio Grande do Sul, Instituto de Informática, Porto Alegre, RS, Brazil

  • Venue:
  • Journal of Systems Architecture: the EUROMICRO Journal
  • Year:
  • 2008

Abstract

High-level performance estimation of embedded software running on a particular processor is essential for fast design space exploration, in which the designer must evaluate different processor architectures (and their versions) as well as different task allocations in a multiprocessor system. This design phase therefore requires estimators that are both fast and sufficiently accurate. However, advanced architectures include features such as pipelines, branch prediction mechanisms, and caches, whose impact on execution time is non-linear and thus hard to capture with simple linear methods. To cope with this problem, this paper presents a high-level performance estimator based on a neural network, which adapts easily to the non-linear behaviour of execution time in advanced architectures and achieves a speed-up of up to 190 times over cycle-accurate simulators, using the PowerPC 750 as the target architecture. A method for automatic domain classification is also proposed to group applications with similar characteristics, which increases estimation precision: for the PowerPC 750, domain-specific estimators reduce the mean estimation error from 7.90% to 6.41%. This precision level and the short estimation time make the approach suitable for high-level design space exploration.
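
The general idea of fitting a neural network to a non-linear execution-time relation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three normalized instruction-mix features, the synthetic cost function with a memory/branch interaction term, the single hidden layer of six tanh units, and all names (`synthetic_cycles`, `TinyMLP`, `train_demo`). It uses only the Python standard library and plain full-batch gradient descent.

```python
import math
import random


def synthetic_cycles(n_alu, n_mem, n_branch):
    """Hypothetical ground truth (an assumption, not the paper's model).

    The product term stands in for a non-linear interaction such as
    cache and branch-prediction effects.
    """
    return 1.0 * n_alu + 2.0 * n_mem + 3.0 * n_branch + 4.0 * n_mem * n_branch


def make_dataset(n, rng):
    """Build n samples of (features, target) with values scaled to [0, 1]."""
    data = []
    for _ in range(n):
        x = [rng.random(), rng.random(), rng.random()]
        y = synthetic_cycles(*x) / 10.0  # maximum of the cost function is 10
        data.append((x, y))
    return data


class TinyMLP:
    """One hidden tanh layer and a linear output, trained on mean squared error."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def train_epoch(self, data, lr):
        """One full-batch gradient-descent step over the whole dataset."""
        n, n_hidden, n_in = len(data), len(self.w2), len(self.w1[0])
        g_w1 = [[0.0] * n_in for _ in range(n_hidden)]
        g_b1 = [0.0] * n_hidden
        g_w2 = [0.0] * n_hidden
        g_b2 = 0.0
        for x, t in data:
            h, y = self.forward(x)
            err = 2.0 * (y - t) / n  # derivative of the MSE w.r.t. the output
            g_b2 += err
            for j in range(n_hidden):
                g_w2[j] += err * h[j]
                dh = err * self.w2[j] * (1.0 - h[j] ** 2)  # back through tanh
                g_b1[j] += dh
                for i in range(n_in):
                    g_w1[j][i] += dh * x[i]
        # Apply the accumulated gradients only after the full pass.
        self.b2 -= lr * g_b2
        for j in range(n_hidden):
            self.w2[j] -= lr * g_w2[j]
            self.b1[j] -= lr * g_b1[j]
            for i in range(n_in):
                self.w1[j][i] -= lr * g_w1[j][i]


def mse(model, data):
    return sum((model.predict(x) - t) ** 2 for x, t in data) / len(data)


def train_demo(seed=0, epochs=500, lr=0.05):
    """Train on synthetic data; return (loss before training, loss after)."""
    rng = random.Random(seed)
    data = make_dataset(200, rng)
    model = TinyMLP(3, 6, rng)
    initial = mse(model, data)
    for _ in range(epochs):
        model.train_epoch(data, lr)
    return initial, mse(model, data)


if __name__ == "__main__":
    before, after = train_demo()
    print(f"MSE before training: {before:.4f}, after: {after:.4f}")
```

In a real setting the features would come from profiling the application and the targets from a cycle-accurate simulator of the target processor; the synthetic function here only stands in for that data.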