Power flow analysis plays an important role in power grid configuration, operation management, and contingency analysis. The Newton-Raphson (NR) iterative method is often used to solve power flow problems, but it involves computationally expensive matrix multiplications (MMs). In this paper we propose a Hierarchical-SIMD (H-SIMD) machine based on field-programmable gate arrays (FPGAs), codesigned with a Hierarchical Instruction Set Architecture (HISA), to speed up MM within each NR iteration. HISA comprises medium-grain and coarse-grain instructions. The H-SIMD machine also facilitates better mapping of MM onto recent multimillion-gate FPGAs. At each level of the hierarchy, every HISA instruction is classified as either a communication or a computation instruction. Communication instructions are executed by a controller, while computation instructions are issued to the lower levels of the hierarchy. Additionally, by using a memory switching scheme and the high-level HISA set to partition applications, host-FPGA communication overheads can be hidden. Our test results show sustained high performance.
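To illustrate the kind of hierarchical decomposition of MM described above, the following is a minimal software sketch of a two-level blocked matrix multiplication. It is not the H-SIMD hardware design or the HISA instruction set: the function names, the block sizes `coarse` and `medium`, and the restriction to square matrices are assumptions made only for illustration. The outer loops stand in for coarse-grain work units and the inner loops for the medium-grain units that each coarse unit is split into.

```python
def block_ranges(start, stop, step):
    """Yield half-open (lo, hi) sub-ranges of [start, stop), each at most `step` wide."""
    for lo in range(start, stop, step):
        yield lo, min(lo + step, stop)


def multiply_accumulate(A, B, C, rows, cols, inner):
    """Hypothetical medium-grain unit: C[rows, cols] += A[rows, inner] * B[inner, cols]."""
    for i in range(*rows):
        for j in range(*cols):
            acc = 0.0
            for k in range(*inner):
                acc += A[i][k] * B[k][j]
            C[i][j] += acc


def hierarchical_matmul(A, B, coarse=4, medium=2):
    """Two-level blocked product of square matrices A and B (lists of lists).

    Block sizes are illustrative assumptions, not values from the paper.
    """
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    # Coarse-grain level: partition the product into coarse x coarse block products.
    for row_blk in block_ranges(0, n, coarse):
        for col_blk in block_ranges(0, n, coarse):
            for k_blk in block_ranges(0, n, coarse):
                # Medium-grain level: split each coarse block product further.
                for r in block_ranges(row_blk[0], row_blk[1], medium):
                    for c in block_ranges(col_blk[0], col_blk[1], medium):
                        for k in block_ranges(k_blk[0], k_blk[1], medium):
                            multiply_accumulate(A, B, C, r, c, k)
    return C
```

Because each medium-grain range is clipped to its enclosing coarse-grain range, the sketch gives the same result as a plain triple loop even when the block sizes do not divide the matrix dimension evenly.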