Improving superword level parallelism support in modern compilers

  • Authors:
  • Christian Tenllado;Luis Piñuel;Manuel Prieto;Francisco Tirado;F. Catthoor

  • Affiliations:
  • Universidad Complutense, Madrid, Spain;Universidad Complutense, Madrid, Spain;Universidad Complutense, Madrid, Spain;Universidad Complutense, Madrid, Spain;Interuniversity MicroElectronic Center (IMEC), Leuven, Belgium

  • Venue:
  • CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
  • Year:
  • 2005

Abstract

Multimedia vector instruction sets are becoming ubiquitous in embedded systems used for multimedia, networking and communications. However, current compiler technology does not allow efficient exploitation of the data parallelism available in many signal processing and multimedia applications. In this paper, we explore the automatic vectorization of embedded applications. In particular, we focus on algorithms in which the same computations are applied to a set of signals that are processed simultaneously. This set of signals is usually represented as a 2D array in which each row is an input signal that has to be filtered in some way. A motivating example, inspired by VoIP processing, illustrates that state-of-the-art vectorizing compilers exploit the data parallelism inherent to this kind of application inefficiently. One of the main reasons is that these applications have inner loops that carry all the dependencies and outer loops with strided memory accesses. We propose a modification of the Superword Level Parallelism (SLP) compiler proposed in [9] that tries to overcome these problems. Experimental results show that our approach clearly outperforms commercial compilers.
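The loop structure described in the abstract can be illustrated with a minimal C sketch. It is not taken from the paper: the first-order recursive filter, the array sizes and the function names are all assumptions chosen for illustration. In the first function the inner loop over samples carries the dependence, so it cannot be vectorized directly. The second function iterates over the signals in the inner loop, where the iterations are independent, but consecutive iterations then touch memory NUM_SAMPLES elements apart in a row-major layout, which is the strided access pattern the abstract mentions.

```c
#include <stddef.h>

#define NUM_SIGNALS 4  /* hypothetical number of signals (rows) */
#define NUM_SAMPLES 8  /* hypothetical number of samples per signal (columns) */

/* Row-major layout: sample t of signal s lives at index s * NUM_SAMPLES + t. */

/* Signal-by-signal version. The inner loop over t carries the dependence
 * through `prev` (y[t] depends on y[t-1]), so its iterations are sequential. */
void filter_rows_scalar(const float *x, float *y, float a)
{
    for (size_t s = 0; s < NUM_SIGNALS; ++s) {
        float prev = 0.0f;
        for (size_t t = 0; t < NUM_SAMPLES; ++t) {
            prev = a * prev + x[s * NUM_SAMPLES + t];
            y[s * NUM_SAMPLES + t] = prev;
        }
    }
}

/* Sample-by-sample version. The inner loop over s has independent iterations
 * (one per signal), but successive iterations access x and y with a stride
 * of NUM_SAMPLES elements instead of contiguous memory. */
void filter_rows_interchanged(const float *x, float *y, float a)
{
    float prev[NUM_SIGNALS] = {0.0f};
    for (size_t t = 0; t < NUM_SAMPLES; ++t) {
        for (size_t s = 0; s < NUM_SIGNALS; ++s) {
            prev[s] = a * prev[s] + x[s * NUM_SAMPLES + t];
            y[s * NUM_SAMPLES + t] = prev[s];
        }
    }
}
```

Both functions compute the same result; they differ only in which loop exposes the parallelism and in the memory access pattern a vectorizer has to cope with.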