Parallel LDPC Decoding on the Cell/B.E. Processor

  • Authors:
  • Gabriel Falcão; Leonel Sousa; Vitor Silva; José Marinho

  • Affiliations:
  • Instituto de Telecomunicações, University of Coimbra, Coimbra, Portugal 3030-290; INESC-ID/IST, Technical University of Lisbon, Lisboa, Portugal 1000-129; Instituto de Telecomunicações, University of Coimbra, Coimbra, Portugal 3030-290; Instituto de Telecomunicações, University of Coimbra, Coimbra, Portugal 3030-290

  • Venue:
  • HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
  • Year:
  • 2008

Abstract

Low-Density Parity-Check (LDPC) codes are among the best error-correcting codes known and have recently been adopted by data transmission standards such as the second-generation Digital Video Broadcasting standard for satellite (DVB-S2) and WiMAX. LDPC codes are based on sparse parity-check matrices and are decoded with message-passing algorithms, also known as belief propagation, which demand very intensive computation. For that reason, dedicated VLSI architectures have been proposed in the past few years to achieve real-time processing. This paper proposes a new flexible and programmable approach to LDPC decoding on the heterogeneous multicore Cell Broadband Engine (Cell/B.E.) architecture. Very compact data structures were developed to represent the bipartite graph of both regular and irregular LDPC codes. They are used to map the irregular behavior of the Sum-Product Algorithm (SPA) used in LDPC decoding onto a computing model that expresses parallelism and data locality by decoupling computation from memory accesses. This model can be used more generally to exploit the capabilities of modern multicore architectures. For the Cell/B.E. in particular, stream-based programs were developed for simultaneous multicodeword LDPC decoding, using SIMD features and a low-latency DMA-based data communication mechanism between processors. Experimental results show significant throughputs that compare well with state-of-the-art VLSI-based solutions.
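
To make the abstract's two central ideas concrete, the following is a minimal sketch of a compact edge-list representation of the bipartite graph and of SPA decoding over it. It is not the authors' implementation: the names `build_graph` and `spa_decode`, the CSR-like layout, the flooding schedule, the convention that a positive log-likelihood ratio means bit 0, and the small (7,4) Hamming matrix used as input are all assumptions made for illustration. The sketch is scalar and decodes one codeword at a time; it does not model the SIMD multicodeword processing or the DMA transfers of the Cell/B.E.

```python
import math

def build_graph(H):
    """Flatten a parity-check matrix into a compact, CSR-like edge list.

    edge_bit[e] is the bit-node index of edge e; the edges of check node c
    occupy positions check_start[c] .. check_start[c + 1] - 1.
    (Hypothetical layout, chosen for illustration.)
    """
    edge_bit, check_start = [], [0]
    for row in H:
        edge_bit.extend(j for j, h in enumerate(row) if h)
        check_start.append(len(edge_bit))
    return edge_bit, check_start

def spa_decode(H, llr, max_iters=10):
    """Sum-product decoding in the LLR domain (positive LLR means bit 0)."""
    edge_bit, check_start = build_graph(H)
    n_checks = len(check_start) - 1
    r = [0.0] * len(edge_bit)          # check-to-bit message on each edge
    total = list(llr)                  # posterior LLR of each bit
    hard = [0 if t >= 0 else 1 for t in total]
    for it in range(1, max_iters + 1):
        # Check-node update: each outgoing message combines the incoming
        # bit-to-check messages of all *other* edges of the same check.
        for c in range(n_checks):
            edges = range(check_start[c], check_start[c + 1])
            t = [math.tanh(0.5 * (total[edge_bit[e]] - r[e])) for e in edges]
            for k, e in enumerate(edges):
                prod = 1.0
                for m, v in enumerate(t):
                    if m != k:
                        prod *= v
                # Clamp to keep atanh finite for very reliable inputs.
                prod = max(-0.999999, min(0.999999, prod))
                r[e] = 2.0 * math.atanh(prod)
        # Bit-node update: channel LLR plus all incoming check messages.
        total = list(llr)
        for e, b in enumerate(edge_bit):
            total[b] += r[e]
        hard = [0 if t >= 0 else 1 for t in total]
        # Stop as soon as every parity check is satisfied.
        if all(sum(hard[edge_bit[e]]
                   for e in range(check_start[c], check_start[c + 1])) % 2 == 0
               for c in range(n_checks)):
            return hard, it
    return hard, max_iters

# Illustrative input: (7,4) Hamming parity-check matrix, all-zero codeword
# sent, with bit 0 received unreliably on the wrong side.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
decoded, iters = spa_decode(H, [-0.5, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0])
print(decoded, iters)
```

Because all graph structure lives in two flat arrays, the memory accesses of each check-node update form a contiguous range that could be fetched before the arithmetic begins. That is one way to read the abstract's "decoupling computation from memory accesses", although the paper's actual data structures may differ.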