This paper proposes an iterative QR decomposition (IQRD) architecture based on the modified Gram-Schmidt (MGS) algorithm. QR decomposition (QRD) is widely used in signal detection for multiple-input-multiple-output (MIMO) systems. To achieve computational efficiency with robust numerical stability, a triangular systolic array (TSA) for the QRD of large matrices is presented. The TSA architecture can in turn be modified into an iterative architecture, called IQRD, to reduce hardware cost. The IQRD hardware is built from a diagonal process and a triangular process, and it has a lower gate count and lower power consumption than the TSA-based QRD (TSAQRD). For a 4 × 4 matrix, the proposed IQRD reduces the gate count by about 41% relative to TSAQRD. For a generic square matrix of order m, the IQRD latency based on the MGS algorithm is 2m - 1 time units, so the total clock latency is only 10m - 5 cycles.
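As a software illustration of the algorithm the abstract names, the following is a minimal sketch of textbook modified Gram-Schmidt QR decomposition together with the two latency formulas quoted above. It is not the authors' hardware design: the function names `mgs_qr` and `iqrd_latency` are hypothetical, the input is assumed to be a real, square, full-rank matrix, and the pairing of the "diagonal" and "off-diagonal" steps in the comments with the paper's diagonal and triangular processes is only an assumed correspondence.

```python
import math


def mgs_qr(a):
    """QR-decompose a square, full-rank matrix (list of rows) with modified Gram-Schmidt.

    Returns (q, r) as lists of rows, with q having orthonormal columns and r upper triangular.
    """
    m = len(a)
    # Work column-wise: v[j] holds the j-th column and is updated as the loop proceeds.
    v = [[a[i][j] for i in range(m)] for j in range(m)]
    q_cols = []
    r = [[0.0] * m for _ in range(m)]
    for k in range(m):
        # Diagonal step (assumed analogue of the diagonal process):
        # the norm of the current column gives r[k][k], and normalizing gives q_k.
        r[k][k] = math.sqrt(sum(x * x for x in v[k]))
        qk = [x / r[k][k] for x in v[k]]
        q_cols.append(qk)
        # Off-diagonal step (assumed analogue of the triangular process):
        # project each remaining column onto q_k and subtract that component.
        for j in range(k + 1, m):
            r[k][j] = sum(qk[i] * v[j][i] for i in range(m))
            v[j] = [v[j][i] - r[k][j] * qk[i] for i in range(m)]
    # Convert the list of q columns back into a list of rows.
    q = [[q_cols[j][i] for j in range(m)] for i in range(m)]
    return q, r


def iqrd_latency(m):
    """Latency figures quoted in the abstract for a matrix of order m."""
    time_units = 2 * m - 1      # latency in time units
    clock_cycles = 10 * m - 5   # total latency in clock cycles
    return time_units, clock_cycles
```

For the 4 × 4 case, `iqrd_latency(4)` evaluates the formulas to 7 time units and 35 clock cycles. Since 10m - 5 = 5(2m - 1), the two figures are consistent with five clock cycles per time unit, although the abstract does not state that breakdown explicitly.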