Design of a 20-Mb/s 256-state Viterbi decoder

Authors:
Xun Liu;Marios C. Papaefthymiou
Affiliations:
Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC and Advanced Computer Architecture Laboratory, Department of Electrical Engineering and Computer Sci ...;Advanced Computer Architecture Laboratory, Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
2003

Abstract

The design of high-throughput, large-state Viterbi decoders relies on the use of multiple arithmetic units. The global communication channels among these parallel processors often consist of long interconnect wires, resulting in large area and high power consumption. In this paper, we propose a data-transfer-oriented design methodology to implement a low-power 256-state, rate-1/3 Viterbi decoder. Our architecture-level scheme uses operation partitioning, packing, and scheduling to analyze and optimize interconnect effects in early design stages. In comparison with other published Viterbi decoders, our approach reduces global data transfers by up to 75% and decreases the number of global buses by up to 48%, while enabling the use of deeply pipelined datapaths with no data forwarding. At the register-transfer level (RTL), we apply precomputation in conjunction with saturation arithmetic to further reduce power dissipation with provably no degradation in coding performance. Designed using a 0.25 µm standard cell library, our decoder achieves a throughput of 20 Mb/s in simulation and dissipates only 0.45 W.
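
For readers unfamiliar with the arithmetic the abstract refers to, the following is a minimal sketch of an add-compare-select (ACS) step with saturating path metrics on a 256-state trellis. It is not the authors' design: the 8-bit metric width, the shift-left state convention, and the names `sat_add`, `predecessors`, and `acs` are all assumptions made for illustration, and the sketch does not reproduce the paper's precomputation scheme or its proof that coding performance is preserved.

```python
NUM_STATES = 256                      # 2**(K-1), assuming constraint length K = 9
METRIC_BITS = 8                       # hypothetical path-metric width
METRIC_MAX = (1 << METRIC_BITS) - 1   # largest representable metric (255)


def sat_add(a, b, limit=METRIC_MAX):
    """Add two non-negative metrics, clamping the sum at `limit`."""
    return min(a + b, limit)


def predecessors(state):
    """Return the two states that can transition into `state`.

    Assumes next_state = ((prev << 1) | input_bit) & (NUM_STATES - 1).
    """
    low = state >> 1
    return low, low | (NUM_STATES >> 1)


def acs(pm_a, bm_a, pm_b, bm_b):
    """Add-compare-select for one state.

    pm_* are predecessor path metrics, bm_* the matching branch metrics.
    Returns (surviving metric, decision bit); ties keep branch A.
    """
    cand_a = sat_add(pm_a, bm_a)
    cand_b = sat_add(pm_b, bm_b)
    if cand_b < cand_a:
        return cand_b, 1
    return cand_a, 0
```

In a parallel decoder, each ACS unit reads the path metrics of both predecessor states, so the predecessor mapping determines which metrics must travel between processing units. That traffic is the kind of global data transfer the abstract sets out to reduce.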