Computer architecture: a quantitative approach
Computer architecture: a quantitative approach
IEEE Transactions on Computers
A variable instruction stream extension to the VLIW architecture
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Limits of instruction-level parallelism
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
SPLASH: Stanford parallel applications for shared-memory
ACM SIGARCH Computer Architecture News
Using profile information to assist classic code optimizations
Software—Practice & Experience
An elementary processor architecture with simultaneous instruction issuing from multiple threads
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Processor coupling: integrating compile time and runtime scheduling for parallelism
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
MISC: a Multiple Instruction Stream Computer
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
A study of the EARTH-MANNA multithreaded system
International Journal of Parallel Programming - Special issue on parallel architectures and compilation techniques—part II
Multithreading with Distributed Functional Units
IEEE Transactions on Computers
ICS '90 Proceedings of the 4th international conference on Supercomputing
Superblock formation using static program analysis
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
A dynamic multithreading processor
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Implementation of precise interrupts in pipelined processors
ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
The IA-64 Architecture at Work
Computer
Data Flow and Dependence Analysis for Instruction Level Parallelism
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
Toward a General-Purpose Multi-Stream System
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
Very Long Instruction Word architectures and the ELI-512
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Performance and Hardware Complexity Tradeoffs in Designing Multithreaded Architectures
PACT '96 Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques
Classification and Performance Evaluation of Simultaneous Multithreaded Architectures
HIPC '97 Proceedings of the Fourth International Conference on High-Performance Computing
Hi-index | 0.00 |
This paper compares two possible implementations of multithreaded architecture and proposes a new architecture combining the flexibility of the first with the low hardware complexity of the second. We present performance and step-by-step complexity analysis of two design alternatives of multithreaded architecture: dynamic inter-thread resource scheduling and static resource allocation. We then introduce a new multithreaded architecture based on a new scheduling mechanism called the “semi-static.” We show that with two concurrent threads the dynamic scheduling processor achieves from 5 to 45 % higher performance at the cost of much more complicated design. This paper indicates that for a relatively high number of execution resources the complexity of the dynamic scheduling logic will inevitably require design compromises. Moreover, high chip-wide communication time and an incomplete bypassing network will limit the dynamic scheduling and reduce its performance advantage. On the other hand, static scheduling architecture achieves low resource utilization. The semi-static architecture utilizes compiler techniques to exploit patterns of program parallelism and introduces a new hardware mechanism, in order to achieve performance close to dynamic scheduling without significantly increasing the static hardware complexity. The semi-static architecture statically assigns part of the functional units but dynamically schedules the most performance-critical functional units on a medium-grain basis.