The Astronautics ZS-1 is a high-speed, 64-bit computer system designed for scientific and engineering applications. The ZS-1 central processor uses a decoupled architecture, which splits instructions into two streams: one for fixed-point/memory address computation and the other for floating-point operations. The two instruction streams are then processed in parallel. Pipelining is also used extensively throughout the ZS-1.

This paper describes the architecture and implementation of the ZS-1 central processor, beginning with some of the basic design objectives. Descriptions of the instruction set, pipeline structure, and virtual memory implementation demonstrate the methods used to satisfy the objectives. High performance is achieved through a combination of static (compile-time) instruction scheduling and dynamic (run-time) scheduling. Both types of scheduling are illustrated with examples.
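
The idea of splitting one program into two streams that run in parallel can be illustrated with a toy model. The sketch below is not the ZS-1 design: the instruction encoding, the function names (split_streams, run_decoupled, run_blocking), the single load queue, and the one-load-per-cycle timing are all assumptions made for illustration. It shows only why letting the address stream run ahead of the floating-point stream can hide memory latency.

```python
from collections import deque

# Hypothetical encoding: each instruction is (unit, text), where unit "A"
# marks fixed-point/address work and "X" marks floating-point work.
def split_streams(program):
    """Partition a mixed program into two streams, keeping order within each."""
    access = [ins for ins in program if ins[0] == "A"]
    execute = [ins for ins in program if ins[0] == "X"]
    return access, execute

def run_decoupled(n_loads, latency):
    """Cycles until the floating-point side has consumed n_loads operands.

    Assumed toy timing: the address side issues one load per cycle and never
    waits for data; the floating-point side consumes one operand per cycle
    once it has arrived in the queue. Intended for latency >= 1.
    """
    in_flight = deque()   # arrival cycles of loads already issued
    ready = deque()       # operands that have arrived and await consumption
    issued = consumed = cycle = 0
    while consumed < n_loads:
        cycle += 1
        while in_flight and in_flight[0] <= cycle:
            ready.append(in_flight.popleft())
        if ready:
            ready.popleft()
            consumed += 1
        if issued < n_loads:
            in_flight.append(cycle + latency)
            issued += 1
    return cycle

def run_blocking(n_loads, latency):
    """Baseline assumption: a single stream that stalls on every load."""
    return n_loads * (latency + 1)

if __name__ == "__main__":
    program = [("A", "load x[i]"), ("X", "fadd"), ("A", "add i, 1"), ("X", "fmul")]
    access, execute = split_streams(program)
    print(len(access), "address instructions,", len(execute), "floating-point instructions")
    print("decoupled cycles:", run_decoupled(4, 3))
    print("blocking cycles: ", run_blocking(4, 3))
```

In this model the decoupled run pays the memory latency once, after which operands arrive every cycle, while the blocking baseline pays it on every load. Real machines add queue capacity limits, branches, and stores, none of which are modeled here.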