Performance Study of a Multithreaded Superscalar Microprocessor

Authors:
Manu Gulati; Nader Bagherzadeh
Venue:
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Year:
1996

Abstract

This paper describes a technique for improving the performance of a superscalar processor through multithreading. The technique exploits the instruction-level parallelism available both within each individual instruction stream and across streams. The former is exploited through out-of-order execution of instructions within a stream, and the latter through simultaneous execution of instructions from different streams. Several aspects of multithreaded superscalar design are studied: fetch policy, cache performance, instruction scheduling, and functional unit utilization. We analyze performance by simulating a superscalar architecture and show that support for multiple streams can be provided with minimal extra hardware while achieving a significant performance gain (20-55%) across a range of benchmarks.
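
The idea of filling issue slots from several streams can be illustrated with a toy model. The sketch below is not the paper's simulator and does not reproduce its results: the function names, the fixed-priority slot assignment, the issue width of 4, and the per-cycle ready-instruction counts are all assumptions chosen for illustration. It only shows why a second stream can raise issue-slot utilization when one stream alone leaves slots empty.

```python
def issued_per_cycle(ready_counts, issue_width):
    """Assign issue slots for one cycle (hypothetical model).

    ready_counts[i] is the number of instructions stream i could issue
    this cycle. Streams are served in fixed priority order (index 0
    first) until the shared issue width is used up.
    """
    remaining = issue_width
    issued = []
    for ready in ready_counts:
        take = min(ready, remaining)
        issued.append(take)
        remaining -= take
    return issued


def utilization(trace, issue_width):
    """Fraction of issue slots filled over a trace.

    trace is a list of cycles; each cycle is a list of per-stream
    ready-instruction counts.
    """
    filled = sum(sum(issued_per_cycle(cycle, issue_width)) for cycle in trace)
    return filled / (issue_width * len(trace))


# Made-up ready counts, not measurements from the paper.
one_stream = [[1], [3], [0], [2]]
two_streams = [[1, 2], [3, 2], [0, 3], [2, 1]]

print(utilization(one_stream, issue_width=4))
print(utilization(two_streams, issue_width=4))
```

In this made-up trace the second stream fills slots that the first leaves idle, including the cycle in which the first stream has nothing ready. A real design would also have to model the fetch policy, cache behavior, and functional unit contention that the abstract lists, none of which this sketch attempts.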