Exploiting multilevel parallelism within modern microprocessors: DWT as a case study

  • Authors:
  • C. Tenllado; C. Garcia; M. Prieto; L. Piñuel; F. Tirado

  • Affiliations:
  • Departamento de Arquitectura de Computadores y Automática, Universidad Complutense, Madrid, Spain (all authors)

  • Venue:
  • VECPAR'04 Proceedings of the 6th international conference on High Performance Computing for Computational Science
  • Year:
  • 2004

Abstract

Simultaneous multithreading (SMT) is being incorporated into modern superscalar microprocessors, allowing several independent threads to issue instructions to the functional units in a single cycle. Effective use of SMT can hide the inefficiencies caused by long operation latencies, thereby yielding better utilization of the processor's resources. In this paper we explore techniques to exploit this capability efficiently, as well as its interaction with short-vector processing. We put special emphasis on the differences in algorithm tuning between SMT architectures and shared-memory symmetric multiprocessors. As a case study we have chosen the well-known Discrete Wavelet Transform (DWT), a centerpiece of some image and video coding standards such as MPEG-4 and JPEG-2000.
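
For readers unfamiliar with the DWT, the following minimal sketch shows one decomposition level and where the two kinds of parallelism named in the abstract could come from. It is an illustration only: the abstract does not say which wavelet filter or implementation the authors use, so the Haar filter and the function names `haar_dwt_1d` and `haar_dwt_rows` are assumptions chosen for brevity, and the code is plain sequential Python rather than threaded or vectorized code.

```python
import math


def haar_dwt_1d(signal):
    """One level of the Haar DWT on an even-length sequence.

    Returns a pair (approximation, detail) of coefficient lists.
    The Haar filter is an assumption made for this sketch.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    # Each output coefficient depends on exactly one input pair, so the
    # iterations are independent of each other. This is the kind of
    # data-level parallelism that short-vector units can exploit.
    approx = [(signal[i] + signal[i + 1]) * scale
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale
              for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt_rows(image):
    """Apply haar_dwt_1d to every row of a 2D list of numbers.

    Rows are transformed independently, so in a parallel version they
    could be divided among threads. Here they are processed in order.
    """
    return [haar_dwt_1d(row) for row in image]
```

Calling `haar_dwt_1d([1.0, 1.0, 2.0, 4.0])` yields two approximation and two detail coefficients; a full 2D transform would apply the same step along the columns of the row-transformed image.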