IEEE Transactions on Computers
A bitstream reconfigurable FPGA implementation of the WSAT algorithm
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special issue on low power electronics and design
Numerical Algorithms for Modern Parallel Computer Architectures
Numerical Algorithms for Modern Parallel Computer Architectures
Reconfigurable Computing for Digital Signal Processing: A Survey
Journal of VLSI Signal Processing Systems
A Super-Programming Approach for Mining Association Rules in Parallel on PC Clusters
IEEE Transactions on Parallel and Distributed Systems
Closing the Gap: CPU and FPGA Trends in Sustainable Floating-Point BLAS Performance
FCCM '04 Proceedings of the 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
64-bit floating-point FPGA matrix multiplication
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
A matrix product accelerator for field programmable systems on chip
Microprocessors & Microsystems
FPGAs (Field-Programmable Gate Arrays) are often used as coprocessors to boost the performance of data-intensive applications [1, 2]. However, mapping algorithms onto multimillion-gate FPGAs is time-consuming and remains a challenge in configurable system design. The communication overhead between the host workstation and the FPGAs is also significant. To address these problems, we propose in this paper the FPGA-based Hierarchical-SIMD (H-SIMD) machine, co-designed with the Hierarchical Instruction Set Architecture (HISA). At each level, HISA instructions are classified as either communication instructions or computation instructions. The former are executed by the local controller, while the latter are issued to the lower level for execution. Additionally, by using a memory switching scheme and the high-level HISA set to partition the application into coarse-grain tasks, the host-FPGA communication overhead can be hidden. We use matrix multiplication (MM) to test the effectiveness of H-SIMD. The test results show sustained high performance.
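The dispatch rule described in the abstract can be illustrated with a minimal sketch. It is not the H-SIMD implementation: the three level names (`host`, `fpga`, `pe`), the instruction names, and the `Instr` and `Controller` classes are all hypothetical, and the sketch assumes that the lowest level executes computation instructions itself because it has no level below it to issue them to. In the real machine a computation instruction would presumably be decoded into a lower-level program; here it is simply forwarded.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Instr:
    kind: str  # "comm" (communication) or "comp" (computation)
    name: str  # hypothetical instruction name, for illustration only


@dataclass
class Controller:
    """One level of a hypothetical controller hierarchy."""
    name: str
    child: Optional["Controller"] = None
    executed: List[str] = field(default_factory=list)

    def issue(self, instr: Instr) -> None:
        # Communication instructions run on the local controller.
        # Assumption: the lowest level also runs computation locally.
        if instr.kind == "comm" or self.child is None:
            self.executed.append(instr.name)
        else:
            # Computation instructions are issued to the lower level.
            self.child.issue(instr)


# Assumed three-level hierarchy: host -> FPGA -> processing element.
pe = Controller("pe")
fpga = Controller("fpga", child=pe)
host = Controller("host", child=fpga)

program = [
    Instr("comm", "load_block"),
    Instr("comp", "mm_block"),
    Instr("comm", "store_block"),
]
for instr in program:
    host.issue(instr)

for level in (host, fpga, pe):
    print(level.name, level.executed)
```

In this sketch the host handles only data movement, and the block multiplication falls through to the lowest level, which mirrors the split between communication and computation that the abstract describes.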