The art of computer programming, volume 1 (3rd ed.): fundamental algorithms
The art of computer programming, volume 1 (3rd ed.): fundamental algorithms
An FPGA-based VLIW processor with custom hardware execution
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
A Multithreaded Soft Processor for SoPC Area Reduction
FCCM '06 Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Supporting multithreading in configurable soft processor cores
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
VESPA: portable, scalable, and flexible FPGA-based vector processors
CASES '08 Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems
Vector Processing as a Soft Processor Accelerator
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
ProtoFlex: Towards Scalable, Full-System Multiprocessor Simulations Using FPGAs
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
VEGAS: soft vector processor with scratchpad memory
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
Area-efficient near-associative memories on FPGAs
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
Hi-index | 0.01 |
Overlay processor architectures allow FPGAs to be programmed by non-experts using software, but prior designs have mainly been based on the architecture of their ASIC predecessors. In this paper we develop a new processor architecture that from the beginning accounts for and exploits the predefined widths, depths, maximum operating frequencies, and other discretizations and limits of the underlying FPGA components. The result is Octavo, a ten-pipeline-stage eight-threaded processor that operates at the block RAM maximum of 550MHz on a Stratix IV FPGA. Octavo is highly parameterized, allowing us to explore trade-offs in datapath and memory width, memory depth, and number of supported thread contexts.