Critical issues regarding HPS, a high performance microarchitecture
MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
ACM Computing Surveys (CSUR)
A preliminary architecture for a basic data-flow processor
ISCA '75 Proceedings of the 2nd annual symposium on Computer architecture
CEDAR: a large scale multiprocessor
ACM SIGARCH Computer Architecture News
HPSm, a high performance restricted data flow architecture having minimal functionality
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Critical issues regarding HPS, a high performance microarchitecture
MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
A development environment for horizontal microcode programs
MICRO 19 Proceedings of the 19th annual workshop on Microprogramming
Run-time generation of HPS microinstructions from a VAX instruction stream
MICRO 19 Proceedings of the 19th annual workshop on Microprogramming
Data flow graph partitioning to reduce communication cost
MICRO 19 Proceedings of the 19th annual workshop on Microprogramming
ACM SIGARCH Computer Architecture News
Checkpoint repair for out-of-order execution machines
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
Reducing execution parameter through correspondence in computer architecture
IBM Journal of Research and Development
Checkpoint repair for high-performance out-of-order execution machines
IEEE Transactions on Computers
Exploiting horizontal and vertical concurrency via the HPSm microprocessor
ACM SIGMICRO Newsletter
On the combination of hardware and software concurrency extraction methods
ACM SIGMICRO Newsletter
Implementing a Prolog machine with multiple functional units
MICRO 21 Proceedings of the 21st annual workshop on Microprogramming and microarchitecture
Multiple instruction issue and single-chip processors
MICRO 21 Proceedings of the 21st annual workshop on Microprogramming and microarchitecture
I-NET mechanism for issuing multiple instructions
Proceedings of the 1988 ACM/IEEE conference on Supercomputing
Tradeoffs in instruction format design for horizontal architectures
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
A high performance Prolog processor with multiple function units
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Forward semantic: a compiler-assisted instruction fetch method for heavily pipelined processors
MICRO 22 Proceedings of the 22nd annual workshop on Microprogramming and microarchitecture
IEEE Transactions on Computers
High-bandwidth data memory systems for superscalar processors
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
A Theory of Reduced and Minimal Procedural Dependencies
IEEE Transactions on Computers
IMPACT: an architectural framework for multiple-instruction-issue processors
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Single instruction stream parallelism is greater than two
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Exploiting fine-grained parallelism through a combination of hardware and software techniques
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Comparing static and dynamic code scheduling for multiple-instruction-issue processors
MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Distributed Instruction Set Computer Architecture
IEEE Transactions on Computers
Effects of building blocks on the performance of super-scalar architecture
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
The expandable split window paradigm for exploiting fine-grain parallelsim
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Concurrency Extraction Via Hardware Methods Executing the Static Instruction Stream
IEEE Transactions on Computers
Register requirements of pipelined processors
ICS '92 Proceedings of the 6th international conference on Supercomputing
An investigation of the performance of various dynamic scheduling techniques
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Extraction of massive instruction level parallelism
ACM SIGARCH Computer Architecture News
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
Facilitating superscalar processing via a combined static/dynamic register renaming scheme
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Improving CISC instruction decoding performance using a fill unit
Proceedings of the 28th annual international symposium on Microarchitecture
Increasing the instruction fetch rate via block-structured instruction set architectures
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
A study on the number of memory ports in multiple instruction issue machines
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
A comparative performance evaluation of various state maintenance mechanisms
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
On the combination of hardware and software concurrency extraction methods
MICRO 20 Proceedings of the 20th annual workshop on Microprogramming
Exploiting horizontal and vertical concurrency via the HPSm microprocessor
MICRO 20 Proceedings of the 20th annual workshop on Microprogramming
On tuning the microarchitecture of an HPS implementation of the VAX
MICRO 20 Proceedings of the 20th annual workshop on Microprogramming
Improving superscalar instruction dispatch and issue by exploiting dynamic code sequences
Proceedings of the 24th annual international symposium on Computer architecture
Target prediction for indirect jumps
Proceedings of the 24th annual international symposium on Computer architecture
Alternative fetch and issue policies for the trace cache fetch mechanism
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
25 years of the international symposia on Computer architecture (selected papers)
IMPACT: an architectural framework for multiple-instruction-issue processors
25 years of the international symposia on Computer architecture (selected papers)
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Control flow optimization for supercomputer scalar processing
ICS '89 Proceedings of the 3rd international conference on Supercomputing
Performance benefits of large execution atomic units in dynamically scheduled machines
ICS '89 Proceedings of the 3rd international conference on Supercomputing
Aggressive Dynamic Execution of Decoded Traces
Journal of VLSI Signal Processing Systems - Special issue on the 1997 IEEE workshop on signal processing systems (SiPS): design and implementation
Allowing for ILP in an embedded Java processor
Proceedings of the 27th annual international symposium on Computer architecture
Increasing the Instruction Fetch Rate via Block-Structured Instruction Set Architectures
International Journal of Parallel Programming
Efficient Instruction Sequencing with Inline Target Insertion
IEEE Transactions on Computers
Instruction Window Size Trade-Offs and Characterization of Program Parallelism
IEEE Transactions on Computers
A Development Environment for Horizontal Microcode
IEEE Transactions on Software Engineering
Aggressive Dynamic Execution of Multimedia Kernel Traces
IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium
Controlling the data space of tree structured computations
Information and Computation
Coherence decoupling: making use of incoherence
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Task superscalar: using processors as functional units
HotPar'10 Proceedings of the 2nd USENIX conference on Hot topics in parallelism
Compilers, architectures and synthesis for embedded computing: retrospect and prospect
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Task Superscalar: An Out-of-Order Task Pipeline
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
MP-Tomasulo: A Dependency-Aware Automatic Parallel Execution Engine for Sequential Programs
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.02 |
HPS (High Performance Substrate) is a new microarchitecture targeted for implementing very high performance computing engines. Our model of execution is a restriction on fine granularity data flow. This paper introduces the model, provides the rationale for its selection, and describes the data path and flow of instructions through the microengine.