Multiscalar Execution along a Single Flow of Control

Authors:
Krishna K. Sundararaman;Manoj Franklin
Venue:
ICPP '97 Proceedings of the International Conference on Parallel Processing
Year:
1997

Abstract

The multiscalar processing model extracts instruction-level parallelism from ordinary programs by splitting the program into smaller, possibly dependent, tasks and executing multiple tasks in parallel on multiple execution units. Past work advocated pursuing multiple flows of control in the multiscalar processor. We first illustrate the problems involved in pursuing multiple flows of control. We then discuss a methodology for obtaining good performance from multiple tasks extracted from a single flow of control. We also present the results of simulation studies that verify the potential of this method. These results, obtained with the SPEC92 benchmarks, show better issue rates when a single flow of control is pursued in the multiscalar processor. The primary reason for this improvement is better load balancing among the execution units.