Fast parallel algorithms for short-range molecular dynamics
Journal of Computational Physics
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
FPGAs vs. CPUs: trends in peak floating-point performance
FPGA '04 Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
Closing the Gap: CPU and FPGA Trends in Sustainable Floating-Point BLAS Performance
FCCM '04 Proceedings of the 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Sparse Matrix-Vector multiplication on FPGAs
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
Floating-point sparse matrix-vector multiply for FPGAs
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
64-bit floating-point FPGA matrix multiplication
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
Design Tradeoffs for BLAS Operations on Reconfigurable Hardware
ICPP '05 Proceedings of the 2005 International Conference on Parallel Processing
Metropolitan Road Traffic Simulation on FPGAs
FCCM '05 Proceedings of the 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
An Analysis of the Double-Precision Floating-Point FFT on FPGAs
FCCM '05 Proceedings of the 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
High Performance Linear Algebra Operations on Reconfigurable Systems
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
A Hybrid Approach for Mapping Conjugate Gradient onto an FPGA-Augmented Reconfigurable Supercomputer
FCCM '06 Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
FCCM '06 Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Hardware/Software Approach to Molecular Dynamics on Reconfigurable Computers
FCCM '06 Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Experiences with a FPGA-based Reed/Solomon-encoding coprocessor
Microprocessors & Microsystems
Parallel backprojection: a case study in high-performance reconfigurable computing
EURASIP Journal on Embedded Systems - FPGA supercomputing platforms, architectures, and techniques for accelerating computationally complex algorithms
From Silicon to Science: The Long Road to Production Reconfigurable Supercomputing
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
A packet-switched network architecture for reconfigurable computing
ACM Transactions on Embedded Computing Systems (TECS)
PPAM'09 Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part I
Hardware implementation of the orbital function for quantum chemistry calculations
ARC'10 Proceedings of the 6th international conference on Reconfigurable Computing: architectures, Tools and Applications
Hi-index | 0.00 |
Reconfigurable computing leveraging field programmable gate arrays (FPGAs) is one of many accelerator technologies that are being investigated for application to high performance computing (HPC). Like most accelerators, FPGAs are very efficient at both dense matrix multiplication and FFT computations, but two important aspects of how to deliver that performance to applications have received too little attention. First, the standard API for important compute kernels hides parallelism from the system. Second, the issue of system architecture is virtually never addressed. This paper explores both issues and their implications for applications. We find that high bandwidth, low latency connectivity can be important, but the right API can be even more important.