Efficient array architectures for multi-dimensional lifting-based discrete wavelet transforms

  • Authors:
  • Cheng-yi Xiong;Jian-hua Hou;Jin-wen Tian;Jian Liu

  • Affiliations:
  • College of Electronic Information Engineering, South-Center University for Nationalities, Wuhan 430074, China and Institute of Pattern Recognition and Artificial Intelligence, Huazhong University ...;College of Electronic Information Engineering, South-Center University for Nationalities, Wuhan 430074, China and Institute of Pattern Recognition and Artificial Intelligence, Huazhong University ...;Institute of Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan 430074, China;Institute of Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan 430074, China

  • Venue:
  • Signal Processing
  • Year:
  • 2007

Quantified Score

Hi-index 0.08

Abstract

Efficient array architectures for the multi-dimensional (m-D) discrete wavelet transform (DWT), e.g. m=2,3, are presented, in which the lifting scheme of the DWT is used to efficiently reduce hardware complexity. The parallelism of the 2^m subband transforms in the lifting-based m-D DWT is exploited, which efficiently increases the throughput rate of the separable m-D DWT with little additional hardware overhead. The proposed architecture is composed of m·2^(m-1) 1-D DWT modules working in parallel and in a pipelined fashion; it is designed to process 2^m input samples per clock cycle and to generate the coefficients of all 2^m subbands synchronously. The total time to complete one level of decomposition for a 2-D image of size N^2 is approximately N^2/4 intra-clock cycles (ccs), and that for a 3-D image sequence of size MN^2 is approximately MN^2/8 ccs. To the best of our knowledge, efficient line-based architecture frameworks for both 2D+t (spatial-domain decomposition first, followed by temporal decomposition) and t+2D (temporal decomposition first, followed by spatial-domain decomposition) 3-D DWT are proposed here for the first time. Compared with similar works reported in the literature, the proposed architectures perform well in terms of throughput rate and system output latency, and offer a good tradeoff between throughput rate and hardware complexity. The proposed architectures are simple, regular, scalable and well suited for VLSI implementation.
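
The figures quoted in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' design: the function name `dwt_array_budget` and the example sizes are hypothetical, and the cycle count is the approximate value from the abstract, ignoring pipeline fill and output latency.

```python
from math import prod


def dwt_array_budget(m, shape):
    """Approximate resource and timing figures for one decomposition
    level of an m-D array architecture, using the formulas stated in
    the abstract. `shape` is the input size along each dimension
    (hypothetical parameter name)."""
    if len(shape) != m:
        raise ValueError("shape must have exactly m dimensions")
    modules = m * 2 ** (m - 1)       # number of parallel 1-D DWT modules
    samples_per_cycle = 2 ** m       # input samples consumed per clock cycle
    # Approximate cycles for one level: total samples / samples per cycle.
    # Pipeline latency is deliberately ignored (assumption).
    cycles = prod(shape) // samples_per_cycle
    return modules, samples_per_cycle, cycles


# 2-D image with N = 512: about N^2/4 cycles
print(dwt_array_budget(2, (512, 512)))      # (4, 4, 65536)

# 3-D sequence with M = 16 frames of 256 x 256: about M*N^2/8 cycles
print(dwt_array_budget(3, (16, 256, 256)))  # (12, 8, 131072)
```

For m=2 this gives 4 modules handling 4 samples per cycle, and for m=3 it gives 12 modules handling 8 samples per cycle, which matches the N^2/4 and MN^2/8 cycle counts stated above.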