Block Turbo Codes (BTCs) are promising forward error correction (FEC) codes that provide close-to-optimal coding gain at rather high coding rates (R > 0.7) and are less subject to an error floor than Convolutional Turbo Codes (CTCs). Thanks to its good convergence properties, the Fang-Buda algorithm (FBA) decodes BTCs efficiently in far fewer iterations than traditional soft-decoding algorithms such as Chase's algorithm. Moreover, it can handle BTC inner codes with a higher minimum distance, which in turn improves coding performance. However, the data-intensive character of the FBA and its very complex control structure are severe bottlenecks for a low-power, high-throughput implementation. Therefore, currently available BTC decoders are based on variants of the Chase algorithm and can only handle simple BTC inner codes. To enable high-performance BTCs without sacrificing throughput or energy, we have analyzed and optimized the FBA, applying a systematic methodology to improve its data transfer and storage characteristics. This paper details the algorithm transformation steps and the resulting memory architecture. The latter, when mapped onto a typical 0.18 μm technology and clocked at 200 MHz, enables BTCs with a maximum throughput of up to 134 Mbps. The memory power consumption, which is dominant for such a data-dominated application, has been estimated after optimization at 16 nJ/bit, while the memory area is estimated at 3.5 mm² per FBA module in the BTC pipeline.