Decoupled access/execute computer architectures

Authors:
James E. Smith
Affiliations:
Department of Electrical and Computer Engineering, University of Wisconsin, Madison, WI
Venue:
ACM Transactions on Computer Systems (TOCS)
Year:
1984

Citing 5
Cited 46

Decoupled access/execute computer architectures

ISCA '82 Proceedings of the 9th annual symposium on Computer Architecture
Coding guidelines for pipelined processors

ASPLOS I Proceedings of the first international symposium on Architectural support for programming languages and operating systems
Information content of CPU memory referencing behavior

ISCA '77 Proceedings of the 4th annual symposium on Computer architecture
Design of a Computer—The Control Data 6600
Validity of the single processor approach to achieving large scale computing capabilities

AFIPS '67 (Spring) Proceedings of the April 18-20, 1967, spring joint computer conference

A Simulation Study of Decoupled Architecture Computers

IEEE Transactions on Computers
A study of scalar compilation techniques for pipelined supercomputers

ACM Transactions on Mathematical Software (TOMS)
A Performance Comparison of the IBM RS/6000 and the Astronautics ZS-1

Computer - Special issue on experimental research in computer architecture
Code generation for streaming: an access/execute mechanism

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Memory latency effects in decoupled architectures with a single data memory module

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Evaluation of the WM architecture

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Register requirements of pipelined processors

ICS '92 Proceedings of the 6th international conference on Supercomputing
The effectiveness of decoupling

ICS '93 Proceedings of the 7th international conference on Supercomputing
Effects of memory latencies on non-blocking processor/cache architectures

ICS '93 Proceedings of the 7th international conference on Supercomputing
Improving superscalar instruction dispatch and issue by exploiting dynamic code sequences

Proceedings of the 24th annual international symposium on Computer architecture
Dynamic vectorization: a mechanism for exploiting far-flung ILP in ordinary programs

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
An investigation of static versus dynamic scheduling

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
PIPE: a VLSI decoupled architecture

ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
Improving Java performance using hardware translation

ICS '01 Proceedings of the 15th international conference on Supercomputing
Improving Latency Tolerance of Multithreading through Decoupling

IEEE Transactions on Computers
MediaBreeze: a decoupled architecture for accelerating multimedia applications

ACM SIGARCH Computer Architecture News - Special Issue: PACT 2001 workshops
A Simulation Study of Decoupled Vector Architectures

The Journal of Supercomputing
Dynamic Code Partitioning for Clustered Architectures

International Journal of Parallel Programming
Memory Latency Effects in Decoupled Architectures

IEEE Transactions on Computers
Measuring Cache and TLB Performance and Their Effect on Benchmark Runtimes

IEEE Transactions on Computers
Non-Consistent Dual Register Files to Reduce Register Pressure

HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Decoupled vector architectures

HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
A Trace Based Evaluation of Speculative Branch Decoupling

ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Overcoming the limitations of conventional vector processors

Proceedings of the 30th annual international symposium on Computer architecture
Bottlenecks in Multimedia Processing with SIMD Style Extensions and Architectural Enhancements

IEEE Transactions on Computers
Dual-pipeline heterogeneous ASIP design

Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Lessons Learned from Model Checking a NASA Robot Controller

Formal Methods in System Design
Exploiting Coarse-Grain Verification Parallelism for Power-Efficient Fault Tolerance

Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Customization of application specific heterogeneous multi-pipeline processors

Proceedings of the conference on Design, automation and test in Europe
Hardware support for software controlled multithreading

ACM SIGARCH Computer Architecture News
Facilitating compiler optimizations through the dynamic mapping of alternate register structures

CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
Explicit data organization SIMD instruction set architecture for media processors

PDCN'07 Proceedings of the 25th IASTED International Multi-Conference: parallel and distributed computing and networks
A highly efficient implementation of back propagation algorithm using matrix instruction set architecture

Neural, Parallel & Scientific Computations
FastForward for efficient pipeline parallelism: a cache-optimized concurrent lock-free queue

Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
A highly efficient implementation of a backpropagation learning algorithm using matrix ISA

Journal of Parallel and Distributed Computing
SoC-C: efficient programming abstractions for heterogeneous multicore systems on chip

CASES '08 Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems
Deriving Efficient Data Movement from Decoupled Access/Execute Specifications

HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
A performance-correctness explicitly-decoupled architecture

Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
FastCrypto: parallel AES pipelines extension for general-purpose processors

Neural, Parallel & Scientific Computations
Codevelopment of multi-level instruction set architecture and hardware for an efficient matrix processor

Neural, Parallel & Scientific Computations
FPGA implementation and performance evaluation of a high throughput crypto coprocessor

Journal of Parallel and Distributed Computing
TL-DAE: thread-level decoupled access/execution for OpenMP on the cyclops-64 many-core processor

LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Mat-core: a decoupled matrix core extension for general-purpose processors

Neural, Parallel & Scientific Computations
Boosting mobile GPU performance with a decoupled access/execute fragment processor

Proceedings of the 39th Annual International Symposium on Computer Architecture
Runtime dependency analysis for loop pipelining in high-level synthesis

Proceedings of the 50th Annual Design Automation Conference
A shared matrix unit for a chip multi-core processor

Journal of Parallel and Distributed Computing
