The Discrete Wavelet Transform (DWT) analyzes the frequency content of a signal and is widely used, for example in the JPEG2000 codec. Many portable, battery-operated applications of the DWT are expected in the near future, and these require a low-power implementation of the transform. This paper presents a parallel VLSI implementation of a 2D lifting-based DWT processor that scales from 2 to 256 parallel units. The design uses an efficient module for distributing data to the parallel units, which adds only a small overhead, and it benefits significantly from voltage scaling to achieve energy efficiency. As the number of parallel units is increased, their speed is reduced through voltage scaling so that throughput stays constant. Our results show that, for a target throughput of 200 MHz, the optimal operating voltage of the parallel units is 386 mV. This is below the threshold voltage, the voltage at which the transistors turn on. Because a circuit operating at a subthreshold voltage consumes more than 100 times less power than at nominal voltage, our design provides the same throughput as a reference pipelined implementation while consuming 26 times less power.
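
For readers unfamiliar with lifting, the following is a minimal software sketch of one level of a 1D lifting-based DWT. The abstract does not say which wavelet filter the processor implements, so the sketch assumes the reversible integer 5/3 filter used by JPEG2000 with symmetric boundary extension. The function names `lifting_53_forward` and `lifting_53_inverse` are hypothetical, and the code illustrates only the predict and update steps, not the hardware architecture. A 2D transform would apply the same steps along the rows and then along the columns of an image.

```python
from typing import List, Tuple


def lifting_53_forward(x: List[int]) -> Tuple[List[int], List[int]]:
    """One level of the integer 5/3 lifting DWT (assumed filter, see lead-in).

    Returns (approximation, detail) for an even-length input.
    """
    if len(x) == 0 or len(x) % 2 != 0:
        raise ValueError("input length must be even and non-zero")
    even, odd = x[0::2], x[1::2]
    n = len(even)

    # Predict step: each odd sample is predicted from its two even neighbours.
    detail = []
    for i in range(n):
        right = even[i + 1] if i + 1 < n else even[n - 1]  # symmetric extension
        detail.append(odd[i] - ((even[i] + right) >> 1))

    # Update step: even samples are corrected using the neighbouring details.
    approx = []
    for i in range(n):
        left = detail[i - 1] if i > 0 else detail[0]  # symmetric extension
        approx.append(even[i] + ((left + detail[i] + 2) >> 2))
    return approx, detail


def lifting_53_inverse(approx: List[int], detail: List[int]) -> List[int]:
    """Undo lifting_53_forward by running the steps in reverse order."""
    n = len(approx)

    # Undo the update step to recover the even samples.
    even = []
    for i in range(n):
        left = detail[i - 1] if i > 0 else detail[0]
        even.append(approx[i] - ((left + detail[i] + 2) >> 2))

    # Undo the predict step to recover the odd samples.
    odd = []
    for i in range(n):
        right = even[i + 1] if i + 1 < n else even[n - 1]
        odd.append(detail[i] + ((even[i] + right) >> 1))

    # Interleave even and odd samples back into the original order.
    x = []
    for e, o in zip(even, odd):
        x.extend([e, o])
    return x
```

Because each lifting step is undone by subtracting exactly what was added, the inverse reconstructs the input exactly in integer arithmetic, whatever the boundary rule.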