In signal processing and communications applications such as antenna-array beamforming, adaptive filtering, multiuser and multiple-input multiple-output (MIMO) detection, channel estimation and equalization, and echo and interference cancellation, solving linear systems of equations often provides optimal performance. However, it is also a computationally expensive operation that designers often avoid by resorting to suboptimal techniques. The dichotomous coordinate descent (DCD) algorithm solves linear systems of equations with high computational efficiency. In this paper, we present architectures and field-programmable gate array (FPGA) designs of two variants of the DCD algorithm, known as the cyclic and leading DCD algorithms. For each variant, we present serial, group-2, and group-4 designs; for the cyclic DCD algorithm, we also present a design with parallel update of the residual vector. These designs have different degrees of parallelism, enabling a tradeoff between FPGA resources and computation time. The serial designs require the fewest FPGA resources and are well suited to applications where many parallel solvers are required, e.g., detection in MIMO orthogonal frequency-division multiplexing (OFDM) communication systems. The parallelism introduced in the group-2 and group-4 designs gives faster convergence to the true solution at the expense of more FPGA resources. The design with parallel update of the residual vector converges fastest; however, for large systems it may require significantly more FPGA resources. The proposed fixed-point designs achieve accuracy very close to that of their floating-point counterparts and require significantly fewer FPGA resources than techniques based on QR decomposition.
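
To make the cyclic variant concrete, the following is a minimal floating-point sketch of cyclic DCD iterations for solving Rx = b with a symmetric positive-definite R. It is a software illustration under stated assumptions, not the FPGA architecture described in the abstract. The function name `cyclic_dcd` and the parameters `H` (assumed amplitude bound on the solution, a power of two), `Mb` (number of bits per solution element), and `Nu` (cap on the number of successful updates) are names chosen here for illustration.

```python
def cyclic_dcd(R, b, H=1.0, Mb=16, Nu=1000):
    """Sketch of cyclic DCD for R x = b, with R symmetric positive definite.

    Assumptions (not taken from the abstract): the solution elements lie in
    [-H, H], H is a power of two, and each element is resolved to Mb bits.
    """
    n = len(b)
    x = [0.0] * n      # solution estimate, starts at zero
    r = list(b)        # residual r = b - R x (equals b while x is zero)
    d = H              # step size; halved before each bit level is processed
    updates = 0

    for _ in range(Mb):
        d /= 2.0       # a power of two: a bit shift in fixed-point hardware
        changed = True
        while changed:                 # repeat passes until no element moves
            changed = False
            for i in range(n):         # cyclic order over the elements
                # Accept a step only if it reduces the quadratic cost.
                if abs(r[i]) > 0.5 * d * R[i][i]:
                    s = 1.0 if r[i] > 0 else -1.0
                    x[i] += s * d
                    # Update the whole residual with column i of R.
                    for j in range(n):
                        r[j] -= s * d * R[j][i]
                    updates += 1
                    changed = True
                    if updates >= Nu:  # stop early at the update budget
                        return x, r
    return x, r


if __name__ == "__main__":
    R = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    x, r = cyclic_dcd(R, b)
    print(x, r)
```

Because every step size is a power of two, the products with `d` reduce to bit shifts in a fixed-point implementation, so the iterations need only additions and comparisons. The inner loop over `j` is the residual update; performing it for all elements at once corresponds to the parallel residual update mentioned above, which trades additional hardware for time.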