The lifting scheme reduces the computational complexity of the Discrete Wavelet Transform (DWT) compared with convolution. The 2-D DWT is a widely used frequency-domain transform in multimedia applications. Because these applications increasingly run on battery-operated handheld devices, there is a need for a low-power yet high-speed and area-efficient chip for the 2-D DWT. We propose a high-performance, memory-efficient architecture with a parallel scanning method for the 2-D DWT using the 5/3 lifting wavelet, and we carry out a chip-level implementation using the UMC 180 nm standard cell library. The architecture is composed of two 1-D DWT units and a Transpose Unit (TU). Compared with other line-based scanning methods, the proposed parallel scanning not only reduces the on-chip line buffer but also enhances throughput. The proposed 2-D DWT architecture uses a buffer of size only 2N for an N×N image, which is low compared with the 3.5N usually required to implement the 5/3 lifting wavelet. The TU operates at half the clock rate, which reduces power, and its design is independent of the input image size. Instead of a shifter, we propose a Hardwired Scaling Unit (HSU) for coefficient multiplication in order to save dynamic power. The architecture is first synthesized using Xilinx ISE 10.1 and implemented on a Virtex-II Pro XC2VP30 FPGA; the RTL is then compiled with the UMC 180 nm standard cell library for an ASIC (Application-Specific Integrated Circuit) implementation. The design is compared with existing architectures in terms of power, speed, and area.
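For readers unfamiliar with the 5/3 lifting wavelet, the following is a minimal software sketch of the arithmetic that a pair of 1-D units and a transpose step would compute. It is a reference model only, not the proposed hardware: it does not model the parallel scanning order, the 2N line buffer, the half-rate TU, or the HSU. It assumes the standard reversible integer 5/3 lifting steps with symmetric boundary extension and even-length rows and columns; the function names (`fwd53`, `inv53`, `fwd53_2d`, `inv53_2d`) are hypothetical and chosen for illustration.

```python
def fwd53(x):
    """Forward integer 5/3 lifting on an even-length list; returns (low, high)."""
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch assumes even length"
    half = n // 2
    even, odd = x[0::2], x[1::2]
    # Predict step: high[i] = odd[i] - floor((even[i] + even[i+1]) / 2),
    # mirroring the last even sample at the right edge.
    high = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
            for i in range(half)]
    # Update step: low[i] = even[i] + floor((high[i-1] + high[i] + 2) / 4),
    # mirroring the first high sample at the left edge.
    low = [even[i] + ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
           for i in range(half)]
    return low, high


def inv53(low, high):
    """Inverse of fwd53: undo the update step, then the predict step."""
    half = len(low)
    even = [low[i] - ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
            for i in range(half)]
    odd = [high[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x


def transpose(m):
    return [list(col) for col in zip(*m)]


def _rows_fwd(m):
    # Each output row is the low-pass half followed by the high-pass half.
    out = []
    for row in m:
        low, high = fwd53(row)
        out.append(low + high)
    return out


def _rows_inv(m):
    return [inv53(row[:len(row) // 2], row[len(row) // 2:]) for row in m]


def fwd53_2d(img):
    """One 2-D level: row pass, transpose, row pass (i.e. column pass), transpose back."""
    return transpose(_rows_fwd(transpose(_rows_fwd(img))))


def inv53_2d(coef):
    """Inverse of fwd53_2d: undo the column pass first, then the row pass."""
    return _rows_inv(transpose(_rows_inv(transpose(coef))))
```

The only scalings in these lifting steps are by 1/2 and 1/4, written here as right shifts by 1 and 2. Because every step is an integer operation that the inverse undoes exactly, `inv53_2d(fwd53_2d(img))` returns the original image for any integer input with even dimensions.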