A unified vector/scalar floating-point architecture

Authors:
N. P. Jouppi; J. Bertoni; D. W. Wall
Affiliations:
Digital Equipment Corporation, Western Research Lab (all authors)
Venue:
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Year:
1989

Abstract

In this paper we present a unified approach to vector and scalar computation, using a single register file for both scalar operands and vector elements. The goal of this architecture is to yield improved scalar performance while broadening the range of vectorizable applications. For example, reduction operations and recurrences can be expressed in vector form in this architecture. This approach results in greater overall performance for most applications than does the approach of emphasizing peak vector performance. The hardware required to support the enhanced vector capability is insignificant, but allows the execution of two operations per cycle for vectorized code. Moreover, the size of the unified vector/scalar register file required for peak performance is an order of magnitude smaller than traditional vector register files, allowing efficient on-chip VLSI implementation. The results of simulations of the Livermore Loops and Linpack using this architecture are presented.
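The abstract's claim that reductions and recurrences can be written in vector form may be easier to see with a small model. The sketch below is illustrative only and is not the paper's instruction set: it assumes a vector operation is executed as a sequence of scalar operations on consecutive registers of one shared register file, each completing before the next begins. The function `vector_op` and its parameters (`dst`, `src_a`, `src_b`, `length`) are hypothetical names chosen for this example.

```python
import operator

def vector_op(regs, op, dst, src_a, src_b, length):
    """Model a vector operation as `length` scalar operations issued in order.

    Assumption (not taken from the paper): element i reads regs[src_a + i]
    and regs[src_b + i], then writes regs[dst + i], and each write is
    visible to the elements that follow it.
    """
    for i in range(length):
        regs[dst + i] = op(regs[src_a + i], regs[src_b + i])
    return regs

# Ordinary element-wise add: r8..r11 = r0..r3 + r4..r7
regs = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0] + [0.0] * 8
vector_op(regs, operator.add, dst=8, src_a=0, src_b=4, length=4)
elementwise = regs[8:12]

# Recurrence (running sum): r[9+i] = r[8+i] + r[i].
# Because src_a trails dst by one register, each element consumes the
# result of the previous element. r8 holds the initial value 0.0.
regs2 = [1.0, 2.0, 3.0, 4.0] + [0.0] * 12
vector_op(regs2, operator.add, dst=9, src_a=8, src_b=0, length=4)
running_sum = regs2[9:13]

# The last element of the recurrence is the sum reduction of r0..r3.
reduction = regs2[12]
```

In this model the only difference between the element-wise add and the recurrence is the choice of register numbers, which is one way to read the abstract's point that a single register file for scalars and vector elements widens the set of loops that can be vectorized.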