Because of their heavy computational requirements, Low-Density Parity-Check (LDPC) error-correcting codes, discovered in the early 1960s, have only recently been adopted by emerging communication standards. LDPC decoders are typically implemented in VLSI technology, which delivers substantial parallel computational power and excellent throughput, but at significant cost. In this work, we propose an alternative, flexible LDPC decoder that exploits data parallelism to decode multiple codewords simultaneously, supported by multithreading on CUDA-based graphics processing units (GPUs). For the efficient min-sum LDPC decoding algorithm proposed, the ratio of arithmetic operations to memory accesses is low, so memory latency and data collisions become the bottleneck. We propose runtime data realignment so that distinct threads within the same warp can perform coalesced parallel memory accesses. Because the memory access patterns of LDPC codes are random, the read and write operations of the decoding process cannot both be coalesced at the same time. To overcome this problem, we developed a data mapping transformation that makes the new addresses contiguous for one of these two memory access types. Our implementation achieves throughputs above 100 Mbps and BER curves that compare well with ASIC solutions.
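To make the two ideas in the abstract concrete, the following is a minimal host-side sketch in Python, not the authors' CUDA kernel. It shows the standard min-sum check-node update, and one common layout (an assumption here, not necessarily the paper's data mapping transformation) in which the messages of several codewords are interleaved so that the threads of a warp, one per codeword, would touch consecutive addresses. The names `check_node_update`, `interleaved_index`, and `update_check_node_batch` are hypothetical.

```python
def check_node_update(llrs):
    """Min-sum update for one check node.

    For each edge, the outgoing sign is the product of the signs of all
    other incoming messages, and the outgoing magnitude is the minimum
    of all other incoming magnitudes. Requires at least two edges.
    """
    mags = [abs(x) for x in llrs]
    # The smallest and second-smallest magnitudes are enough to answer
    # every "minimum over all edges except this one" query.
    idx_min = min(range(len(llrs)), key=mags.__getitem__)
    min1 = mags[idx_min]
    min2 = min(m for i, m in enumerate(mags) if i != idx_min)

    # Product of all signs (zero is treated as positive).
    total_sign = 1
    for x in llrs:
        if x < 0:
            total_sign = -total_sign

    out = []
    for i, x in enumerate(llrs):
        # Multiplying by the edge's own sign removes it from the product.
        sign = total_sign * (-1 if x < 0 else 1)
        mag = min2 if i == idx_min else min1
        out.append(sign * mag)
    return out


def interleaved_index(edge, codeword, num_codewords):
    """Assumed layout: the messages of one edge for all codewords are
    stored side by side, so thread `codeword` reads base + codeword."""
    return edge * num_codewords + codeword


def update_check_node_batch(messages, edges, num_codewords):
    """Apply the check-node update to every codeword in the batch.

    `messages` is a flat list in the interleaved layout. The loop over
    `c` stands in for the GPU threads of a warp: for a fixed edge, the
    addresses touched by c = 0, 1, 2, ... are contiguous.
    """
    result = list(messages)
    for c in range(num_codewords):
        incoming = [messages[interleaved_index(e, c, num_codewords)]
                    for e in edges]
        outgoing = check_node_update(incoming)
        for e, value in zip(edges, outgoing):
            result[interleaved_index(e, c, num_codewords)] = value
    return result
```

In this layout, accesses along the codeword dimension are always contiguous. The random pattern the abstract refers to comes from the edge indices dictated by the parity-check matrix, which this sketch does not model.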