Optimization strategies in different CUDA architectures using llCoMP

  • Authors:
  • Ruymán Reyes; Francisco de Sande

  • Affiliations:
  • Dept. de E. I. O. y Computación, Universidad de La Laguna, 38271 La Laguna, Spain; Dept. de E. I. O. y Computación, Universidad de La Laguna, 38271 La Laguna, Spain

  • Venue:
  • Microprocessors & Microsystems
  • Year:
  • 2012

Abstract

Due to the current proliferation of GPU devices in HPC environments, scientists and engineers spend much of their time optimizing codes for these platforms. At the same time, manufacturers produce new versions of their devices every few years, each one more powerful than the last. The question that arises is: is the optimization effort worthwhile? In this paper, we present a review of the different CUDA architectures, including Fermi, and optimize a set of algorithms for each using widely known optimization techniques. This work would require a tremendous coding effort if done manually. However, using our fast prototyping tool, this is an effortless process. The result of our analysis will guide developers on the right path towards efficient code optimization. Preliminary results show that some optimizations recommended for older CUDA architectures may not be useful for the newer ones.