Generalization of the Sethi-Ullman algorithm for register allocation
Software—Practice & Experience
Scalable vector media-processors for embedded systems
Scalable vector media-processors for embedded systems
Memory Coloring: A Compiler Approach for Scratchpad Memory Management
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Accelerator: using data parallelism to program GPUs for general-purpose uses
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Scratchpad memory management for portable systems with a memory management unit
EMSOFT '06 Proceedings of the 6th ACM & IEEE International conference on Embedded software
Vector processing as a soft-core CPU accelerator
Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays
VESPA: portable, scalable, and flexible FPGA-based vector processors
CASES '08 Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems
Vector Processing as a Soft Processor Accelerator
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
FPGA Circuit Synthesis of Accelerator Data-Parallel Programs
FCCM '10 Proceedings of the 2010 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines
VEGAS: soft vector processor with scratchpad memory
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
Hi-index | 0.00 |
This paper describes the compiler design for VENICE, a new soft vector processor (SVP). The compiler is a new back-end target for Microsoft Accelerator, a high-level data parallel library for C++ and C#. This allows us to automatically compile high-level programs into VENICE assembly code, thus avoiding the process of writing assembly code used by previous SVPs. Experimental results show the compiler can generate scalable parallel code with execution times that are comparable to hand-written VENICE assembly code. On data-parallel applications, VENICE at 100MHz on an Altera DE3 platform runs at speeds comparable to one core of a 3.5GHz Intel Xeon W3690 processor, beating it in performance on four of six benchmarks by up to 3.2x.