This paper demonstrates the design of efficient asynchronous bundled-data pipelines for the matrix-vector multiplication core of discrete cosine transforms (DCTs). The architecture is optimized for both zero and small-valued data, typical in DCT applications, yielding both high average performance and low average power. The proposed bundled-data pipelines include novel data-dependent delay lines with integrated control circuitry to efficiently implement speculative completion sensing. The control circuits are based on a novel control-circuit template that simplifies the design of such nonlinear pipelines. Extensive post-layout back-end timing analysis was performed to gain confidence in the timing margins as well as to quantify performance and energy. Comparison with a synchronous counterpart suggests that our best asynchronous design yields 30% higher average throughput with negligible energy overhead.
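The idea of optimizing for zero and small-valued data can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the circuit described above: the three delay classes, their unit values (`DELAY_ZERO`, `DELAY_SMALL`, `DELAY_FULL`), the `SMALL_BITS` threshold, and the function names are all hypothetical and chosen for illustration. The model processes one input operand at a time, charges a delay that depends on the operand's magnitude, and compares the total against a fixed worst-case delay per operand, which is roughly how a clocked design would behave.

```python
# Behavioral sketch of data-dependent delay in a matrix-vector multiply.
# All constants below are hypothetical; they are not taken from the paper.
DELAY_ZERO = 1   # assumed cost when the operand is zero (work is skipped)
DELAY_SMALL = 2  # assumed cost when the operand fits in SMALL_BITS bits
DELAY_FULL = 4   # assumed cost for a full-width operand
SMALL_BITS = 4   # assumed signed bit width that counts as "small-valued"


def operand_delay(x: int) -> int:
    """Return the assumed delay class for one input operand."""
    if x == 0:
        return DELAY_ZERO
    low = -(1 << (SMALL_BITS - 1))
    high = 1 << (SMALL_BITS - 1)
    if low <= x < high:
        return DELAY_SMALL
    return DELAY_FULL


def matvec_with_delay(matrix, vector):
    """Compute matrix * vector column by column and sum the per-operand delays."""
    result = [0] * len(matrix)
    total_delay = 0
    for j, x in enumerate(vector):
        total_delay += operand_delay(x)
        if x == 0:
            continue  # a zero operand contributes nothing to any output
        for i, row in enumerate(matrix):
            result[i] += row[j] * x
    return result, total_delay


def worst_case_delay(vector) -> int:
    """Delay if every operand were charged the full-width cost."""
    return DELAY_FULL * len(vector)


if __name__ == "__main__":
    m = [[1, 2, 3], [4, 5, 6]]
    v = [0, 3, 100]  # one zero, one small, one full-width operand
    y, d = matvec_with_delay(m, v)
    print(y, d, worst_case_delay(v))
```

With inputs dominated by zeros and small values, as is typical of DCT data, the summed data-dependent delay in this model falls well below the worst-case figure, which is the intuition behind the average-case performance claim.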