Energy Efficient Memory Architecture for High Speed Decoding of Block Turbo-Codes with the Fang-Buda Algorithm

Authors:
B. Bougard;M. Rullmann;E. Brockmeyer;L. Van Der Perre;F. Catthoor;W. Dehaene
Affiliations:
Interuniversity Microelectronics Center (IMEC), Kapeldreef 75, B-3001 Leuven, Belgium;Interuniversity Microelectronics Center (IMEC), Kapeldreef 75, B-3001 Leuven, Belgium;Interuniversity Microelectronics Center (IMEC), Kapeldreef 75, B-3001 Leuven, Belgium;Interuniversity Microelectronics Center (IMEC), Kapeldreef 75, B-3001 Leuven, Belgium;Interuniversity Microelectronics Center (IMEC), Kapeldreef 75, B-3001 Leuven, Belgium;Interuniversity Microelectronics Center (IMEC), Kapeldreef 75, B-3001 Leuven, Belgium
Venue:
Journal of VLSI Signal Processing Systems
Year:
2005

Abstract

Block Turbo-Codes (BTC) are promising forward error correction (FEC) codes that provide close-to-optimal coding gain at rather high coding rates (R > 0.7) and are less subject to an error floor than Convolutional Turbo Codes (CTC). Owing to its good convergence properties, the Fang-Buda algorithm (FBA) decodes BTCs efficiently in far fewer iterations than traditional soft-decoding algorithms such as Chase's algorithm. Moreover, it can handle BTC inner codes with a higher minimum distance, which in turn improves coding performance. However, the data-intensive character of the FBA and its very complex control structure are severe bottlenecks for a low-power, high-throughput implementation. Therefore, currently available BTC decoders are based on variants of the Chase algorithm and can only handle simple BTC inner codes. To enable high-performance BTCs without sacrificing throughput or energy, we have analyzed and optimized the FBA using a systematic methodology for improving its data transfer and storage characteristics. This paper details the algorithm transformation steps and the resulting memory architecture. When mapped to a typical 0.18 μm technology and clocked at 200 MHz, this architecture enables BTC decoding with a maximum throughput of up to 134 Mbps. The memory power consumption, which is dominant for such a data-dominated application, has been estimated after optimization at 16 nJ/bit, and the memory area at 3.5 mm² per FBA module in the BTC pipeline.
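
The headline figures in the abstract can be related to each other with a short back-of-envelope calculation. The Python sketch below is illustrative only and is not taken from the paper. The function and constant names are hypothetical. It also assumes that the 16 nJ/bit figure applies per decoded bit at the peak 134 Mbps throughput, which the abstract does not state, so the resulting power value should be read as a rough estimate under that assumption.

```python
# Back-of-envelope relations between the figures quoted in the abstract.
# All names are hypothetical; the numeric inputs are the abstract's values.

CLOCK_HZ = 200e6                 # clock frequency quoted in the abstract
PEAK_THROUGHPUT_BPS = 134e6      # maximum throughput quoted in the abstract
MEMORY_ENERGY_J_PER_BIT = 16e-9  # estimated memory energy per bit


def cycles_per_bit(clock_hz: float, throughput_bps: float) -> float:
    """Average number of clock cycles available per decoded bit."""
    return clock_hz / throughput_bps


def power_watts(energy_j_per_bit: float, throughput_bps: float) -> float:
    """Average power implied by an energy-per-bit figure at a given bit rate.

    ASSUMPTION: the energy figure is per decoded bit and holds at this
    throughput; the abstract does not confirm either point.
    """
    return energy_j_per_bit * throughput_bps


if __name__ == "__main__":
    cpb = cycles_per_bit(CLOCK_HZ, PEAK_THROUGHPUT_BPS)
    p_mem = power_watts(MEMORY_ENERGY_J_PER_BIT, PEAK_THROUGHPUT_BPS)
    print(f"cycles per decoded bit: {cpb:.2f}")
    print(f"implied memory power at peak throughput: {p_mem:.3f} W")
```

If the energy figure instead refers to a different operating point or a different unit of work, only the inputs to `power_watts` change; the relation power = energy per bit × bit rate stays the same.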