A Systolic Architecture for Fast Dense Matrix Inversion
IEEE Transactions on Computers
Algorithms for Statistical Signal Processing
Algorithms for Statistical Signal Processing
Design of a parameterizable Silicon intellectual property core for QR-based RLS filtering
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
An Introduction to Parallel and Vector Scientific Computation (Cambridge Texts in Applied Mathematics)
VLSI Architecture for Matrix Inversion using Modified Gram-Schmidt based QR Decomposition
VLSID '07 Proceedings of the 20th International Conference on VLSI Design held jointly with 6th International Conference: Embedded Systems
50 years of CORDIC: algorithms, architectures, and applications
IEEE Transactions on Circuits and Systems Part I: Regular Papers
Double-precision Gauss-Jordan Algorithm with Partial Pivoting on FPGAs
DSD '09 Proceedings of the 2009 12th Euromicro Conference on Digital System Design, Architectures, Methods and Tools
Hi-index | 0.01 |
In this paper, VLSI array architectures for matrix inversion are studied. A new binary-coded z-path (Bi-z) CORDIC is developed and implemented to compute the operations required in the matrix inversion using the Givens rotation (GR) based QR decomposition. The Bi-z CORDIC allows both the GR vectoring and rotation mode, as well as division and multiplication to be executed in a single unified processing element (PE). Hence, a 2D (2 dimensional) array consisting of PEs with different functionalities can be folded into a 1D array to reduce hardware complexity. The Bi-z CORDIC also eliminates the arithmetic complexity of the angle quantization and formation computation that exist in the traditional CORDIC. Two mapping techniques, namely a linear mapping method and an interlaced mapping method, for mapping a 2D matrix inversion array into a 1D array are proposed and developed. Consequently two corresponding array architectures are designed and implemented. Both the architectures use the Bi-z CORDIC in their PEs and they are designed to be fully scalable and parameterizable in terms of matrix size and data wordlength. The linear mapping method is a straightforward mapping offering simple schedule and control signals. The interlaced mapping method has a more complicated schedule with complex control signals but achieves 100% or near 100% processor utilization for odd and even size matrix, respectively.