Performance and power evaluation of an in-line accelerator
Proceedings of the 7th ACM international conference on Computing frontiers
There is increasing interest in the use of accelerators in computer systems. Accelerators are processor-attached hardware units that can perform certain functions faster than a conventional general-purpose processor. In this paper, we describe the VICTORIA PowerPC architecture, which is based on the iVMX accelerator technology. The iVMX accelerator extends the existing VMX architecture with indirect register addressing. This approach greatly enlarges the architected register space and opens the door to highly optimized vector algorithms that can sustain very high processing rates. The large register space is directly controlled by the executing code and offers enough storage to hold sizeable intermediate results, which helps reduce the negative effects of limited memory bandwidth and high memory latency. The iVMX accelerator is an example of an in-line accelerator; that is, the instructions that drive the accelerator are part of the same stream that drives the main processor. Compared to off-line accelerators, which execute their own instruction stream, in-line accelerators present a much more convenient programming model.
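The general idea behind indirect register addressing can be illustrated with a small toy model. The sketch below is not the iVMX encoding: the class name IndirectRegisterFile, the helper spill_free_store, the 1024-entry register file, and the per-field mapping table are all assumptions made for illustration. The only grounded number is the 5-bit register field, which matches the 32 vector registers that VMX instructions can name directly.

```python
# Toy model of indirect register addressing.
# ASSUMPTION: the file size and the mapping scheme are hypothetical and do
# not describe the real iVMX hardware or its instruction encoding.
DIRECT_FIELD_BITS = 5        # a 5-bit field can name 32 registers
PHYSICAL_REGISTERS = 1024    # hypothetical size of the enlarged register file


class IndirectRegisterFile:
    """Register file in which a small field selects a large register via a map."""

    def __init__(self):
        self.regs = [0] * PHYSICAL_REGISTERS
        # One map entry per value the instruction field can encode.
        # Initially field i points at physical register i.
        self.index_map = list(range(1 << DIRECT_FIELD_BITS))

    def set_map(self, field, physical):
        """Point an instruction-visible field at any physical register."""
        if not 0 <= field < len(self.index_map):
            raise ValueError("field does not fit in the instruction encoding")
        if not 0 <= physical < PHYSICAL_REGISTERS:
            raise ValueError("physical register out of range")
        self.index_map[field] = physical

    def read(self, field):
        return self.regs[self.index_map[field]]

    def write(self, field, value):
        self.regs[self.index_map[field]] = value


def spill_free_store(rf, values, base=100):
    """Keep many intermediate values in registers using field 0 only."""
    for offset, value in enumerate(values):
        rf.set_map(0, base + offset)   # retarget the same 5-bit field
        rf.write(0, value)             # lands in a distinct physical register
```

In this model the code still encodes only 5-bit register fields, yet by updating the map it can reach every entry of the larger file, so intermediate results can stay in registers instead of being written back to memory.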