Recent work shows that RISC-based extensible processors can achieve further speedup if the incorporated custom functional units (CFUs) can execute functions with more than two inputs and one output. However, mechanisms for executing multiple-input, multiple-output (MIMO) custom functions in a RISC processor have not been addressed. This paper proposes an extension for single-issue RISC processors based on a CFU that can execute custom functions with up to six inputs and three outputs. To minimize changes to the core processor, we maintain the operand bandwidth of two inputs and one output per cycle and transfer the extra operands and results using repeated custom instructions. While keeping such a limit sacrifices some speedup, our experiments show that the MIMO extension still achieves an average 51% increase in speedup over a dual-input, single-output (DISO) extension and an average 27% increase in speedup over a multiple-input, single-output (MISO) extension.
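The operand-transfer constraint described above can be illustrated with a small counting sketch. It assumes, as a simplification not stated in the abstract, that each repeated custom instruction carries at most two register inputs and one register output, and that input and output transfers can share the same instruction. The function name `custom_instructions_needed` and the constants are hypothetical and are not taken from the paper.

```python
import math

# Operand bandwidth per custom instruction, as stated in the abstract.
READS_PER_INSTR = 2
WRITES_PER_INSTR = 1

# CFU limits, as stated in the abstract.
MAX_INPUTS = 6
MAX_OUTPUTS = 3


def custom_instructions_needed(num_inputs: int, num_outputs: int) -> int:
    """Lower bound on repeated custom instructions for one custom function.

    Assumption (hypothetical model, not the paper's exact scheme): an
    instruction may deliver two inputs and collect one output at once,
    so the count is set by whichever direction needs more transfers.
    """
    if not 0 <= num_inputs <= MAX_INPUTS:
        raise ValueError("input count outside the CFU limit")
    if not 0 <= num_outputs <= MAX_OUTPUTS:
        raise ValueError("output count outside the CFU limit")

    for_inputs = math.ceil(num_inputs / READS_PER_INSTR)
    for_outputs = math.ceil(num_outputs / WRITES_PER_INSTR)
    # At least one instruction is always needed to trigger the function.
    return max(1, for_inputs, for_outputs)


if __name__ == "__main__":
    for shape in [(2, 1), (4, 1), (6, 3)]:
        print(shape, custom_instructions_needed(*shape))
```

Under this model a DISO-shaped function (two inputs, one output) needs a single instruction, while the largest supported function (six inputs, three outputs) needs three. That is one way to see why keeping the two-read, one-write limit costs some of the attainable speedup.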