ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Instruction issue logic for high-performance, interruptable pipelined processors
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
Checkpoint repair for high-performance out-of-order execution machines
IEEE Transactions on Computers
Implementing Precise Interrupts in Pipelined Processors
IEEE Transactions on Computers
Software pipelining: an effective scheduling technique for VLIW machines
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Available instruction-level parallelism for superscalar and superpipelined machines
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Limits on multiple instruction issue
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
IEEE Transactions on Computers
Boosting beyond static scheduling in a superscalar processor
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
A brief survey of papers on scheduling for pipelined processors
ACM SIGPLAN Notices
Computer architecture: a quantitative approach
Computer architecture: a quantitative approach
A Development Environment for Horizontal Microcode
IEEE Transactions on Software Engineering
Performance enhancement of SISD processors
ISCA '79 Proceedings of the 6th annual symposium on Computer architecture
Instruction issue logic for pipelined supercomputers
ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
Super-Scalar Processor Design
Performance analysis and design methodology for a scalable superscalar architecture
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
The 16-fold way: a microparallel taxonomy
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Hi-index | 0.01 |
A new superscalar processor architecture, called DSNS (Dynamically-hazard-resolved, Statically-code-scheduled, Nonuniform Superscalar), is proposed and discussed. DSNS has the following major architectural features.1. Dynamically-hazard-resolved superscalar: DSNS is object-code compatible with respect to the degree of superscalar. Pipeline interlock hardware should be provided for detecting and resolving hazards at run time.2. Statically-cade-scheduled superscalar: The performance of DSNS could not be scalable with respect to the degree of superscalar. Compilers must be responsible for scheduling instructions to reduce the pipeline stalls for a particular degree of superscalar.3. Nonuniform superscalar: Although nonuniform superscalar potentially suffers instruction-class conflicts, it can be more cost-effective than uniform superscalar. Again compilers must take care that the class conflicts do not increase structural hazards.4. Static memory disambiguation: The DSNS architecture provides three types of LOAD/STORE instructions; strongly ordered, weakly ordered, and unordered. Memory disambiguation at compile time is responsible for marking each LOAD/STORE instruction. At run time, processors need not detect nor resolve data hazards for every type; they just perform memory accesses inorder for strongly or weakly ordered instructions, and arbitrarily for unordered.5. Static branch prediction with branch-target buffer: Branch instructions predicted as taken by compilers are stored in the branch target buffer. Hardware never guesses the outcomes of branch instructions.6. Early branch resolution with advanced conditioning: Advanced conditioning allows branch decisions to precede further the corresponding branches. It reduces the branch delay and results in resolving branches early in the pipeline.7. Conditional mode execution with dual register files: Dual register file facilitates maintaining the precise machine state that otherwise might be violated by speculative execution such as conditional mode.8. Weakly precise interrupts: The DSNS architecture defines interrupts as being somewhat imprecise but restartable with the help of interrupt handlers. The definition alleviates hardware constraints for ensuring precise interrupts strongly.This paper also presents an implementation of the DSNS architecture. The DSNS processor prototype under development is a four-stage pipelined processor of superscalar-degree four. The instruction pipelines, especially the branch pipeline, are discussed in detail.