Memory Bandwidth: The True Bottleneck of SIMD Multimedia Performance on a Superscalar Processor

  • Authors:
  • Julien Sébot;Nathalie Drach-Temam

  • Affiliations:
  • -;-

  • Venue:
  • Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents the performance of DSP, image and 3D applications on recent general-purpose microprocessors using streaming SIMD ISA extensions (integer and floating point). The 9 benchmarks benchmark we use for this evaluation have been optimized for DLP and caches use with SIMD extensions and data prefetch. The result of these cumulated optimizations is a speedup that ranges from 1.9 to 7.1. All the benchmarks were originaly computation bound and 7 becomes memory bandwidth bound with the addition of SIMD and data prefetch. Quadrupling the memory bandwidth has no effect on original kernels but improves the performance of SIMD kernels by 15-55%.