A Fine-Grained Pipelined Implementation for Large-Scale Matrix Inversion on FPGA

  • Authors:
  • Jie Zhou;Yong Dou;Jianxun Zhao;Fei Xia;Yuanwu Lei;Yuxing Tang

  • Affiliations:
  • National Laboratory for Parallel & Distributed Processing, NUDT, Changsha, P.R. China 410073;National Laboratory for Parallel & Distributed Processing, NUDT, Changsha, P.R. China 410073;Academy of Armored Forces Engineering, Beijing, China 100072;National Laboratory for Parallel & Distributed Processing, NUDT, Changsha, P.R. China 410073;National Laboratory for Parallel & Distributed Processing, NUDT, Changsha, P.R. China 410073;National Laboratory for Parallel & Distributed Processing, NUDT, Changsha, P.R. China 410073

  • Venue:
  • APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
  • Year:
  • 2009

Abstract

Large-scale matrix inversion plays an important role in many applications. However, to the best of our knowledge, there is no FPGA-based implementation. In this paper, we explore the possibility of accelerating large-scale matrix inversion on an FPGA. To exploit the computational potential of the FPGA, we introduce a fine-grained parallel algorithm for matrix inversion. A scalable linear array of processing elements (PEs), which is the core component of the FPGA accelerator, is proposed to implement this algorithm. A total of 12 PEs can be integrated into an Altera Stratix II EP2S130F1020C5 FPGA on our self-designed board. Experimental results show that a speedup factor of 2.6 and a maximum power-performance of 41 can be achieved compared to a Pentium Dual CPU running two SSE threads.