Design of a high-throughput low-power IS95 Viterbi decoder

Authors:
Xun Liu;Marios C. Papaefthymiou
Affiliations:
University of Michigan, MI;University of Michigan, MI
Venue:
Proceedings of the 39th annual Design Automation Conference
Year:
2002

Abstract

The design of high-throughput large-state Viterbi decoders relies on the use of multiple arithmetic units. The global communication channels among these parallel processors often consist of long interconnect wires, resulting in large area and high power consumption. In this paper, we propose a data-transfer-oriented design methodology to implement a low-power 256-state rate-1/3 IS95 Viterbi decoder. Our architectural-level scheme uses operation partitioning, packing, and scheduling to analyze and optimize interconnect effects in early design stages. In comparison with other published Viterbi decoders, our approach reduces the global data transfers by up to 75% and decreases the number of global buses by up to 48%, while enabling the use of deeply pipelined datapaths with no data forwarding. In the RTL implementation of the individual processors, we apply precomputation in conjunction with saturation arithmetic to further reduce power dissipation with provably no coding performance degradation. Designed using a 0.25 μm standard cell library, our decoder achieves a throughput of 20 Mbps in simulation and dissipates only 450 mW.
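
For readers unfamiliar with the decoding problem the abstract refers to, the following is a minimal software sketch of a 256-state rate-1/3 Viterbi decoder whose add-compare-select step uses saturating path metrics. It is not the paper's architecture: it models none of the partitioning, packing, scheduling, or precomputation described above. The generator polynomials (octal 557, 663, 711), the hard-decision Hamming branch metric, the 8-bit metric width, and all function names are assumptions chosen for illustration. Because the sketch never renormalizes its metrics, it is only suitable for short blocks whose true path metric stays below the saturation limit.

```python
K = 9                             # constraint length; 2**(K-1) = 256 states
N_STATES = 1 << (K - 1)
POLYS = (0o557, 0o663, 0o711)     # assumed rate-1/3 generator polynomials
SAT_MAX = 255                     # assumed 8-bit path-metric width


def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def branch_output(state, bit):
    """Three coded bits emitted when `bit` enters the encoder in `state`."""
    reg = (bit << (K - 1)) | state          # newest bit in the top position
    return tuple(parity(reg & g) for g in POLYS)


def next_state(state, bit):
    """Shift the new bit in and drop the oldest one."""
    return ((bit << (K - 1)) | state) >> 1


def encode(bits):
    """Encode a block and append K-1 zero tail bits to return to state 0."""
    state, out = 0, []
    for b in list(bits) + [0] * (K - 1):
        out.extend(branch_output(state, b))
        state = next_state(state, b)
    return out


def sat_add(a, b):
    """Saturating add: clamp at SAT_MAX instead of overflowing."""
    return min(a + b, SAT_MAX)


def decode(symbols):
    """Hard-decision Viterbi decoding of a zero-terminated block."""
    n_steps = len(symbols) // 3
    # Only state 0 is a valid start; SAT_MAX acts as "unreachable".
    metrics = [SAT_MAX] * N_STATES
    metrics[0] = 0
    history = []
    for t in range(n_steps):
        rx = symbols[3 * t:3 * t + 3]
        new_metrics = [SAT_MAX] * N_STATES
        prev = [(0, 0)] * N_STATES
        for s in range(N_STATES):
            for b in (0, 1):
                ns = next_state(s, b)
                # Branch metric: Hamming distance to the received triple.
                bm = sum(x != y for x, y in zip(branch_output(s, b), rx))
                cand = sat_add(metrics[s], bm)      # add
                if cand < new_metrics[ns]:          # compare
                    new_metrics[ns] = cand          # select
                    prev[ns] = (s, b)
        metrics = new_metrics
        history.append(prev)
    # Trace back from state 0, where the tail bits force the encoder to end.
    state, bits = 0, []
    for prev in reversed(history):
        state, b = prev[state]
        bits.append(b)
    bits.reverse()
    return bits[:n_steps - (K - 1)]                 # strip the tail bits
```

In this sketch the saturated value doubles as the "unreachable" marker for states that cannot yet lie on a valid path, so no separate infinity encoding is needed. A hardware decoder would evaluate many of these add-compare-select operations in parallel, which is where the global data transfers discussed in the abstract arise.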