This paper presents VEGAS, a new soft vector architecture in which the vector processor reads and writes directly to a scratchpad memory instead of a vector register file. The scratchpad memory is a more efficient storage medium than a vector register file, allowing up to 9x more data elements to fit into on-chip memory. In addition, the use of fracturable ALUs in VEGAS allows efficient processing of bytes, halfwords and words in the same processor instance, providing up to 4x as many operations as existing fixed-width soft vector ALUs. Benchmarks show that the new VEGAS architecture is 10x to 208x faster than Nios II and has a 1.7x to 3.1x better area-delay product than previous vector work, achieving much higher throughput per unit area. To put this performance in perspective, VEGAS is faster than a leading-edge Intel processor at integer matrix multiply. To ease programming effort and provide full debug support, VEGAS uses a C macro API that outputs vector instructions as standard Nios II/f custom instructions.
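
To make the "up to 4x" figure concrete, the following is a minimal software sketch of what a fracturable 32-bit adder does: the same 32-bit datapath performs one word add, two halfword adds, or four byte adds, depending on where the carry chain is cut. This is an illustration only, not the VEGAS hardware or its C macro API; the function name `frac_add` and its `width` parameter are hypothetical.

```c
#include <stdint.h>

/* Hypothetical illustration of a fracturable ALU add.
 * width selects the lane size in bits: 8, 16, or 32.
 * Carries are prevented from crossing lane boundaries, so each
 * lane wraps around independently (modulo 2^width). */
static uint32_t frac_add(uint32_t a, uint32_t b, int width)
{
    uint32_t high; /* mask holding the top bit of every lane */

    switch (width) {
    case 8:  high = 0x80808080u; break; /* four byte lanes     */
    case 16: high = 0x80008000u; break; /* two halfword lanes  */
    default: high = 0x80000000u; break; /* one full word lane  */
    }

    /* Add the low bits of every lane; with the top bits cleared,
     * a carry can reach the top bit of a lane but never leave it. */
    uint32_t low_sum = (a & ~high) + (b & ~high);

    /* Fold the top bits back in with XOR, which adds them
     * without generating a carry into the next lane. */
    return low_sum ^ ((a ^ b) & high);
}
```

With inputs `0x0000FFFF` and `0x00000001`, the word mode yields `0x00010000`, the halfword mode yields `0x00000000` (the low halfword wraps), and the byte mode yields `0x0000FF00` (only the lowest byte wraps). In byte mode one pass through the adder completes four independent operations, which is where the 4x upper bound over a fixed-width word ALU comes from.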