We propose a modified systolic architecture that implements the 1-D discrete wavelet transform (DWT) using the recursive pyramid algorithm (RPA) while correctly handling the border problem and achieving perfect reconstruction. None of the architectures described in the literature so far explicitly addresses border handling; they usually assume a zero-padding extension. The RPA relies heavily on this assumption, which yields very efficient systolic architectures. It is well known, however, that the zero-padding extension does not allow perfect recovery of the original signal if exactly N coefficients are computed. More coefficients are required in either the forward or the inverse transform, and producing them within the RPA scheme lowers the efficiency of the systolic architecture to a minimum. We introduce a modified RPA (EC-RPA) that works on an extended transform with a compressed schedule and restores good efficiency. We also propose a second version of the RPA, based on a periodic extension of the signal and of the transform (PE-RPA), that achieves perfect reconstruction by computing exactly N DWT coefficients. A reduced version of the second algorithm (R-PE-RPA) is tailored to the efficient computation of a fixed number of transform levels. The VLSI complexity of border management is analyzed through synthesis from a VHDL description of the three algorithms. Synthesis results show that the controller required for PE-RPA can be as large as a ten-tap array, whereas R-PE-RPA requires half this area. In both cases, the controller complexity scales sublinearly with the length of the input
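The border problem described above can be illustrated in software with a minimal sketch. It is not the systolic architecture or the RPA schedule from the paper: it computes a single transform level, and the 4-tap Daubechies filter, the function names (`analyze`, `synthesize`, `max_error`), and the test signal are all assumptions chosen for illustration. The sketch computes exactly N coefficients under either a zero-padding or a periodic extension and then applies the standard synthesis step.

```python
import math

# Assumption: 4-tap Daubechies (D4) orthonormal filters; the abstract does not name a filter.
S3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
H = [(1 + S3) / NORM, (3 + S3) / NORM, (3 - S3) / NORM, (1 - S3) / NORM]  # low-pass
G = [H[3], -H[2], H[1], -H[0]]                                            # high-pass


def analyze(x, periodic):
    """One DWT level producing exactly len(x) coefficients (len(x) even, >= 4)."""
    n = len(x)

    def sample(idx):
        if periodic:
            return x[idx % n]               # periodic extension: wrap around
        return x[idx] if idx < n else 0.0   # zero-padding extension

    approx = [sum(H[k] * sample(2 * i + k) for k in range(4)) for i in range(n // 2)]
    detail = [sum(G[k] * sample(2 * i + k) for k in range(4)) for i in range(n // 2)]
    return approx, detail


def synthesize(approx, detail, periodic):
    """Standard synthesis (transpose of the analysis step) from N coefficients."""
    n = 2 * len(approx)
    x = [0.0] * n
    for i in range(len(approx)):
        for k in range(4):
            idx = 2 * i + k
            if periodic:
                idx %= n
            elif idx >= n:
                continue  # contribution falls in the zero-padded region
            x[idx] += H[k] * approx[i] + G[k] * detail[i]
    return x


def max_error(x, periodic):
    """Largest absolute reconstruction error after analysis plus synthesis."""
    approx, detail = analyze(x, periodic)
    rec = synthesize(approx, detail, periodic)
    return max(abs(a - b) for a, b in zip(x, rec))


signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]  # hypothetical test input
err_periodic = max_error(signal, periodic=True)
err_zero_pad = max_error(signal, periodic=False)
print("periodic extension error:", err_periodic)
print("zero-padding error:      ", err_zero_pad)
```

In this sketch the periodic case reconstructs the signal to within floating-point rounding, while the zero-padded case loses accuracy in the first two samples, because the coefficients that overlap the border are never computed when only N are kept.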