HPS, a new microarchitecture: rationale and introduction

Authors:
Y. N. Patt;W. M. Hwu;M. Shebanow
Affiliations:
Computer Science Division, University of California, Berkeley, Berkeley, CA;Computer Science Division, University of California, Berkeley, Berkeley, CA;Computer Science Division, University of California, Berkeley, Berkeley, CA
Venue:
MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
Year:
1985

Citing 4
Cited 68

Critical issues regarding HPS, a high performance microarchitecture

MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
Look-Ahead Processors

ACM Computing Surveys (CSUR)
A preliminary architecture for a basic data-flow processor

ISCA '75 Proceedings of the 2nd annual symposium on Computer architecture
CEDAR: a large scale multiprocessor

ACM SIGARCH Computer Architecture News

HPSm, a high performance restricted data flow architecture having minimal functionality

ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Critical issues regarding HPS, a high performance microarchitecture

MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
A development environment for horizontal microcode programs

MICRO 19 Proceedings of the 19th annual workshop on Microprogramming
Run-time generation of HPS microinstructions from a VAX instruction stream

MICRO 19 Proceedings of the 19th annual workshop on Microprogramming
Data flow graph partitioning to reduce communication cost

MICRO 19 Proceedings of the 19th annual workshop on Microprogramming
Aquarius

ACM SIGARCH Computer Architecture News
Checkpoint repair for out-of-order execution machines

ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
Reducing execution parameters through correspondence in computer architecture

IBM Journal of Research and Development
Checkpoint repair for high-performance out-of-order execution machines

IEEE Transactions on Computers
The Cydra 5 Departmental Supercomputer: Design Philosophies, Decisions, and Trade-Offs

Computer
Exploiting horizontal and vertical concurrency via the HPSm microprocessor

ACM SIGMICRO Newsletter
On the combination of hardware and software concurrency extraction methods

ACM SIGMICRO Newsletter
Implementing a Prolog machine with multiple functional units

MICRO 21 Proceedings of the 21st annual workshop on Microprogramming and microarchitecture
Multiple instruction issue and single-chip processors

MICRO 21 Proceedings of the 21st annual workshop on Microprogramming and microarchitecture
I-NET mechanism for issuing multiple instructions

Proceedings of the 1988 ACM/IEEE conference on Supercomputing
Tradeoffs in instruction format design for horizontal architectures

ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
SIMP (Single Instruction stream/Multiple instruction Pipelining): a novel high-speed single-processor architecture

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
A high performance Prolog processor with multiple function units

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Forward semantic: a compiler-assisted instruction fetch method for heavily pipelined processors

MICRO 22 Proceedings of the 22nd annual workshop on Microprogramming and microarchitecture
Instruction Issue Logic for High-Performance, Interruptible, Multiple Functional Unit, Pipelined Computers

IEEE Transactions on Computers
High-bandwidth data memory systems for superscalar processors

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
A Theory of Reduced and Minimal Procedural Dependencies

IEEE Transactions on Computers
IMPACT: an architectural framework for multiple-instruction-issue processors

ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Single instruction stream parallelism is greater than two

ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Exploiting fine-grained parallelism through a combination of hardware and software techniques

ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Comparing static and dynamic code scheduling for multiple-instruction-issue processors

MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
The effect of real data cache behavior on the performance of a microarchitecture that supports dynamic scheduling

MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Distributed Instruction Set Computer Architecture

IEEE Transactions on Computers
Effects of building blocks on the performance of super-scalar architecture

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
The expandable split window paradigm for exploiting fine-grain parallelism

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Concurrency Extraction Via Hardware Methods Executing the Static Instruction Stream

IEEE Transactions on Computers
Register requirements of pipelined processors

ICS '92 Proceedings of the 6th international conference on Supercomputing
An investigation of the performance of various dynamic scheduling techniques

MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Extraction of massive instruction level parallelism

ACM SIGARCH Computer Architecture News
Shared memory consistency conditions for non-sequential execution: definitions and programming strategies

SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
Facilitating superscalar processing via a combined static/dynamic register renaming scheme

MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Using predicated execution to improve the performance of a dynamically scheduled machine with speculative execution

PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Improving CISC instruction decoding performance using a fill unit

Proceedings of the 28th annual international symposium on Microarchitecture
Increasing the instruction fetch rate via block-structured instruction set architectures

Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
A study on the number of memory ports in multiple instruction issue machines

MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
A comparative performance evaluation of various state maintenance mechanisms

MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
On the combination of hardware and software concurrency extraction methods

MICRO 20 Proceedings of the 20th annual workshop on Microprogramming
Exploiting horizontal and vertical concurrency via the HPSm microprocessor

MICRO 20 Proceedings of the 20th annual workshop on Microprogramming
On tuning the microarchitecture of an HPS implementation of the VAX

MICRO 20 Proceedings of the 20th annual workshop on Microprogramming
Improving superscalar instruction dispatch and issue by exploiting dynamic code sequences

Proceedings of the 24th annual international symposium on Computer architecture
Target prediction for indirect jumps

Proceedings of the 24th annual international symposium on Computer architecture
Alternative fetch and issue policies for the trace cache fetch mechanism

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Retrospective: HPSm, a high performance restricted data flow architecture having minimal functionality

25 years of the international symposia on Computer architecture (selected papers)
IMPACT: an architectural framework for multiple-instruction-issue processors

25 years of the international symposia on Computer architecture (selected papers)
A novel renaming scheme to exploit value temporal locality through physical register reuse and unification

MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Control flow optimization for supercomputer scalar processing

ICS '89 Proceedings of the 3rd international conference on Supercomputing
Performance benefits of large execution atomic units in dynamically scheduled machines

ICS '89 Proceedings of the 3rd international conference on Supercomputing
Aggressive Dynamic Execution of Decoded Traces

Journal of VLSI Signal Processing Systems - Special issue on the 1997 IEEE workshop on signal processing systems (SiPS): design and implementation
Allowing for ILP in an embedded Java processor

Proceedings of the 27th annual international symposium on Computer architecture
Increasing the Instruction Fetch Rate via Block-Structured Instruction Set Architectures

International Journal of Parallel Programming
Guest Editor's Introduction: Real Machines: Design Choices/Engineering Trade-Offs

Computer
One Billion Transistors, One Uniprocessor, One Chip

Computer
Efficient Instruction Sequencing with Inline Target Insertion

IEEE Transactions on Computers
Instruction Window Size Trade-Offs and Characterization of Program Parallelism

IEEE Transactions on Computers
A Development Environment for Horizontal Microcode

IEEE Transactions on Software Engineering
Aggressive Dynamic Execution of Multimedia Kernel Traces

IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Controlling the data space of tree structured computations

Information and Computation
Coherence decoupling: making use of incoherence

ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Task superscalar: using processors as functional units

HotPar'10 Proceedings of the 2nd USENIX conference on Hot topics in parallelism
Compilers, architectures and synthesis for embedded computing: retrospect and prospect

CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Task Superscalar: An Out-of-Order Task Pipeline

MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
MP-Tomasulo: A Dependency-Aware Automatic Parallel Execution Engine for Sequential Programs

ACM Transactions on Architecture and Code Optimization (TACO)

Quantified Score

Hi-index	0.02

Abstract

HPS (High Performance Substrate) is a new microarchitecture targeted for implementing very high performance computing engines. Our model of execution is a restriction on fine granularity data flow. This paper introduces the model, provides the rationale for its selection, and describes the data path and flow of instructions through the microengine.