2-D Wavelet Transform Enhancement on General-Purpose Microprocessors: Memory Hierarchy and SIMD Parallelism Exploitation

Authors:
Daniel Chaver; Christian Tenllado; Luis Piñuel; Manuel Prieto; Francisco Tirado
Venue:
HiPC '02 Proceedings of the 9th International Conference on High Performance Computing
Year:
2002



Abstract

This paper addresses the implementation of a 2-D Discrete Wavelet Transform on general-purpose microprocessors, focusing on both memory hierarchy and SIMD parallelization issues. Both topics are somewhat related, since SIMD extensions are only useful if the memory hierarchy is efficiently exploited. In this work, locality has been significantly improved by means of a novel approach called pipelined computation, which complements previous techniques based on loop tiling and non-linear layouts. As experimental platforms we have employed a Pentium-III (P-III) and a Pentium-4 (P-4) microprocessor. However, our SIMD-oriented tuning has been exclusively performed at source code level. Basically, we have reordered some loops and introduced some modifications that allow automatic vectorization. Taking into account the abstraction level at which the optimizations are carried out, the speedups obtained on the investigated platforms are quite satisfactory, even though further improvement can be obtained by dropping the level of abstraction (compiler intrinsics or assembly code).
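The loop reordering mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the row-major `float` layout, and the use of an unnormalized Haar average/difference pair in place of the longer filters a real DWT would use are all assumptions made for illustration. The first routine filters vertically one column at a time, so consecutive accesses are `cols` elements apart. The second interchanges the loops so the inner loop runs along a row with unit stride, which is the kind of access pattern a vectorizing compiler can typically turn into SIMD code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical baseline: vertical Haar step traversed column by column.
   The inner loop walks down a column, so successive accesses are `cols`
   floats apart (strided, cache-unfriendly, hard to vectorize). */
void haar_vertical_by_column(const float *in, float *lo, float *hi,
                             size_t rows, size_t cols)
{
    assert(rows % 2 == 0);              /* assumption: even number of rows */
    for (size_t c = 0; c < cols; ++c) {
        for (size_t r = 0; r < rows / 2; ++r) {
            float a = in[(2 * r) * cols + c];
            float b = in[(2 * r + 1) * cols + c];
            lo[r * cols + c] = 0.5f * (a + b);   /* low-pass: average     */
            hi[r * cols + c] = 0.5f * (a - b);   /* high-pass: difference */
        }
    }
}

/* Same arithmetic with the loops interchanged: the inner loop runs along
   a row pair with unit stride. `restrict` tells the compiler the buffers
   do not overlap, which helps automatic vectorization. */
void haar_vertical_by_row(const float *restrict in, float *restrict lo,
                          float *restrict hi, size_t rows, size_t cols)
{
    assert(rows % 2 == 0);              /* assumption: even number of rows */
    for (size_t r = 0; r < rows / 2; ++r) {
        const float *even = in + (2 * r) * cols;  /* row 2r     */
        const float *odd  = even + cols;          /* row 2r + 1 */
        float *l = lo + r * cols;
        float *h = hi + r * cols;
        for (size_t c = 0; c < cols; ++c) {
            l[c] = 0.5f * (even[c] + odd[c]);
            h[c] = 0.5f * (even[c] - odd[c]);
        }
    }
}
```

Both routines produce identical output. Only the traversal order differs, so the change stays at source level, in line with the abstraction level the abstract describes. Whether a given compiler actually vectorizes the second loop depends on its flags and version.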