Performance benefits of large execution atomic units in dynamically scheduled machines

Authors:
Stephen W. Melvin; Yale N. Patt
Affiliations:
Computer Science Division, University of California, Berkeley, CA; Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI
Venue:
ICS '89 Proceedings of the 3rd international conference on Supercomputing
Year:
1989

Abstract

In this paper we identify three types of atomic units, or indivisible units of work: architectural atomic units (defined by architecture-level interrupts and exceptions), compiler atomic units (defined by compiler code generation), and execution atomic units (defined by run-time interruptibility). We discuss trade-offs for these units and show that size has different performance implications depending on the atomic unit. We simulate a number of different implementations of the VAX architecture, focusing on different execution atomic unit sizes. We show that significant performance benefits can be achieved by having large execution atomic units in dynamically scheduled machines.