Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
A performance study of general-purpose applications on graphics processors using CUDA
Journal of Parallel and Distributed Computing
Implementation of an SDR platform using GPU and its application to a 2 × 2 MIMO WiMAX system
Analog Integrated Circuits and Signal Processing
Implementation of a High Throughput 3GPP Turbo Decoder on GPU
Journal of Signal Processing Systems
An improved low-power high-throughput log-MAP turbo decoder
IEEE Transactions on Consumer Electronics
Implementation of an SDR system using graphics processing unit
IEEE Communications Magazine
Hi-index | 0.00 |
In this paper, we present an implementation of a long term evolution (LTE) system on a software defined radio (SDR) platform using a conventional personal computer that adopts a graphic processing unit (GPU) and a universal software radio peripheral2 (USRP2) with a URSP hardware driver (UHD) to implement an SDR software modem and a radio frequency transceiver, respectively. The central processing unit executes C++ control code that can access the USRP2 via the UHD. We have adopted the Ettus Research UHD due to its high degree of flexibility in the design of the transceiver chain. By taking advantage of this benefit, a simple cognitive radio engine has been implemented using libraries provided by the UHD. We have implemented the software modem on a GPU that is suitable for parallel computing due to its powerful arithmetic and logic units. A parallel programming method is proposed that exploits the single instruction multiple data architecture of the GPU. We focus on the implementation of the Turbo decoder due to its high computational requirements and difficulty in parallelizing the algorithm. The implemented system is analyzed primarily in terms of computation time using the compute unified device architecture profiler. From our experimental tests using the implemented system, we have measured the total processing time for a single frame of both transmit and receive LTE data. We find that it takes 5.00 and 8.58 ms for transmit and receive, respectively. This confirms that the implemented system is capable of real-time processing of all the baseband signal processing algorithms required for LTE systems.