Fighting the memory wall with assisted execution

Authors:
Michel Dubois
Affiliations:
University of Southern California, Los Angeles, CA
Venue:
Proceedings of the 1st conference on Computing frontiers
Year:
2004

Citing 14
Cited 3

Simultaneous multithreading: maximizing on-chip parallelism

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Evaluation of Hardware-Based Stride and Sequential Prefetching in Shared-Memory Multiprocessors

IEEE Transactions on Parallel and Distributed Systems
Informing memory operations: providing memory performance feedback in modern processors

ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
The interaction of software prefetching with ILP processors in shared-memory systems

Proceedings of the 24th annual international symposium on Computer architecture
Tolerating late memory traps in ILP processors

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
The use of multithreading for exception handling

Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Execution-based prediction using speculative slices

ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Speculative precomputation: long-range prefetching of delinquent loads

ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors

ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Difficult-path branch prediction using subordinate microthreads

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Simultaneous Multithreading: A Platform for Next-Generation Processors

IEEE Micro
Hybrid compiler/hardware prefetching for multiprocessors using low-overhead cache miss traps

ICPP '97 Proceedings of the international Conference on Parallel Processing
Lockup-free instruction fetch/prefetch cache organization

ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
Hardware Support for Prescient Instruction Prefetch

HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture

Moving Address Translation Closer to Memory in Distributed Shared-Memory Multiprocessors

IEEE Transactions on Parallel and Distributed Systems
Dynamic Helper Threaded Prefetching on the Sun UltraSPARC CMP Processor

Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
"Flea-flicker" Multipass Pipelining: An Alternative to the High-Power Out-of-Order Offense

Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture

Quantified Score

Hi-index	0.00

Visualization

Abstract

Assisted execution is a form of simultaneous multithreading in which a set of auxiliary "assistant" threads, called nanothreads, is attached to each thread of an application. Nanothreads are lightweight threads which run on the same processor as the main (application) thread and help execute the main thread as fast as possible. Nanothreads exploit resources that are idled in the processor because of hazards due to program dependencies and memory access delays.Assisted execution has the potential to alter the current trade-offs between static and dynamic execution mechanisms. Nanothreads can monitor and reconfigure the underlying hardware, can emulate hardware and can profile applications with little or no interference to improve the program on-line or off-line.We demonstrate the power of assisted execution with an important application, namely data prefetching to fight the memory wall problem. Simulation results on several SPEC95 benchmarks show that sequential and stride prefetching implemented with nanothreads performs just as well as ideal hardware prefetchers.