Numerical linear algebra operations are key primitives in scientific computing, and their performance optimization has been extensively investigated. With rapid advances in technology, hardware acceleration of linear algebra applications using FPGAs (Field-Programmable Gate Arrays) has become feasible. In this paper, we propose FPGA-based designs for several basic linear algebra operations, including dot product, matrix-vector multiplication, matrix multiplication, and matrix factorization. By identifying the parameters of each operation, we analyze the trade-offs and propose a high-performance design. In implementing the designs, the parameter values are determined according to hardware constraints such as the available chip area, the size of available memory, the memory bandwidth, and the number of I/O pins. The proposed designs are implemented on Xilinx Virtex-II Pro FPGAs. Experimental results show that our designs scale with the available hardware resources, and that their performance compares favorably with that of general-purpose processor-based designs. We also show that with faster floating-point units and larger devices, the performance of our designs increases accordingly.
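The kind of parameter trade-off described above can be illustrated with a simple back-of-the-envelope model: the chip area bounds how many floating-point processing elements (PEs) fit, while the memory bandwidth bounds how fast they can be fed. The sketch below is not from the paper; all parameter names and values are hypothetical placeholders chosen only to show the shape of the analysis.

```python
# Illustrative sketch (not the paper's model): choosing the number of
# floating-point PEs for an FPGA dot-product design under area and
# bandwidth constraints. All numeric values are hypothetical.

def max_processing_elements(available_slices, slices_per_pe):
    """Area constraint: how many multiply-add PEs fit on the chip."""
    return available_slices // slices_per_pe

def sustained_gflops(num_pes, clock_mhz, mem_bandwidth_gbs, bytes_per_flop):
    """Sustained performance is the lesser of the compute peak and what
    the memory bandwidth can feed (a simple roofline-style bound)."""
    compute_peak = num_pes * 2 * clock_mhz / 1e3   # mul + add per cycle per PE
    bandwidth_bound = mem_bandwidth_gbs / bytes_per_flop
    return min(compute_peak, bandwidth_bound)

# Hypothetical Virtex-II Pro-class budget.
pes = max_processing_elements(available_slices=30000, slices_per_pe=1500)
perf = sustained_gflops(pes, clock_mhz=200,
                        mem_bandwidth_gbs=6.4, bytes_per_flop=8)
```

With these placeholder numbers the design is bandwidth-bound rather than compute-bound, which is exactly the situation where streaming operations like the dot product gain little from adding more PEs; a larger device or faster floating-point units shifts the balance, as the abstract notes.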