VEGAS: soft vector processor with scratchpad memory

Authors:
Christopher H. Chou;Aaron Severance;Alex D. Brant;Zhiduo Liu;Saurabh Sant;Guy G.F. Lemieux
Affiliations:
The University of British Columbia, Vancouver, BC, Canada (all authors)
Venue:
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
Year:
2011


Abstract

This paper presents VEGAS, a new soft vector architecture in which the vector processor reads from and writes to a scratchpad memory directly instead of using a vector register file. The scratchpad memory is a more efficient storage medium than a vector register file, allowing up to 9x more data elements to fit into on-chip memory. In addition, the use of fracturable ALUs in VEGAS allows efficient processing of bytes, halfwords and words in the same processor instance, providing up to 4x the operations compared to existing fixed-width soft vector ALUs. Benchmarks show the new VEGAS architecture is 10x to 208x faster than Nios II and has 1.7x to 3.1x better area-delay product than previous vector work, achieving much higher throughput per unit area. To put this performance in perspective, VEGAS is faster than a leading-edge Intel processor at integer matrix multiply. To ease programming effort and provide full debug support, VEGAS uses a C macro API that outputs vector instructions as standard Nios II/f custom instructions.
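
The fracturable-ALU idea can be illustrated in software. The sketch below is not VEGAS hardware or its macro API; it is a minimal C emulation, with hypothetical function names, of one 32-bit adder that acts as four 8-bit lanes, two 16-bit lanes, or one 32-bit lane by blocking carries at lane boundaries. It shows where the "up to 4x" figure for byte data comes from: the same 32-bit datapath yields four results instead of one.

```c
#include <stdint.h>

/* Hypothetical helper: add packed lanes held in two 32-bit words.
 * `top` has the most significant bit of every lane set. Carries are
 * stopped at lane boundaries, so each lane wraps independently. */
static uint32_t add_lanes(uint32_t a, uint32_t b, uint32_t top)
{
    /* Add everything except each lane's top bit; any carry out of the
     * lower bits lands in that lane's own top-bit position. */
    uint32_t low = (a & ~top) + (b & ~top);
    /* Fold in the top bits with XOR so no carry leaves the lane. */
    return low ^ ((a ^ b) & top);
}

/* Four independent 8-bit additions in one 32-bit operation. */
static uint32_t add_bytes(uint32_t a, uint32_t b)
{
    return add_lanes(a, b, 0x80808080u);
}

/* Two independent 16-bit additions in one 32-bit operation. */
static uint32_t add_halfwords(uint32_t a, uint32_t b)
{
    return add_lanes(a, b, 0x80008000u);
}

/* One ordinary 32-bit addition (the unfractured case). */
static uint32_t add_word(uint32_t a, uint32_t b)
{
    return a + b;
}
```

For the operands `0x0000FFFF` and `0x00000001`, the three modes give different results because the carry stops at a different boundary in each: the byte mode wraps only the lowest byte, the halfword mode wraps the lower halfword, and the word mode carries into bit 16.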