Compiled multithreaded data paths on FPGAs for dynamic workloads

Authors:
Robert J. Halstead;Walid Najjar
Affiliations:
University of California, Riverside, Riverside, California;University of California, Riverside, Riverside, California
Venue:
Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
Year:
2013


Abstract

Hardware-supported multithreading can mask memory latency by switching execution to ready threads, which is particularly effective for irregular applications. FPGAs provide an opportunity to customize multithreaded data paths to each individual application. In this paper we describe the compiler generation of these hardware structures from a C subset targeting a Convey HC-2ex machine. We describe how this compilation approach differs from other C-to-HDL compilers. We use the compiler to generate a multithreaded sparse matrix-vector multiplication kernel and compare its performance to existing FPGA implementations and highly optimized software implementations.
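
For readers unfamiliar with the kernel named in the abstract, the following is a minimal software sketch of sparse matrix-vector multiplication, assuming the common compressed sparse row (CSR) layout. It is not the paper's compiler input or its generated hardware; the function name `spmv_csr` and all parameter names are hypothetical and chosen for illustration only. The indirect read `x[col_idx[k]]` is the kind of irregular memory access whose latency multithreading is meant to hide.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative sketch (hypothetical names): y = A * x, with A in CSR form.
 *   row_ptr has n_rows + 1 entries; row i owns nonzeros
 *   row_ptr[i] .. row_ptr[i + 1] - 1 of vals and col_idx.
 */
void spmv_csr(size_t n_rows, const size_t *row_ptr, const size_t *col_idx,
              const double *vals, const double *x, double *y)
{
    assert(row_ptr && col_idx && vals && x && y);

    for (size_t i = 0; i < n_rows; i++) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; k++) {
            /* Data-dependent index: the address into x is only known
               after col_idx[k] has been loaded. */
            sum += vals[k] * x[col_idx[k]];
        }
        y[i] = sum;
    }
}
```

In this sketch each row is independent of the others, so one plausible way to think about the threads mentioned in the abstract is as separate rows (or separate nonzeros) in flight at once; how the paper actually partitions the work is not stated here.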