The case for a single-chip multiprocessor
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Assigning confidence to conditional branch predictions
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Understanding the backward slices of performance degrading instructions
Proceedings of the 27th annual international symposium on Computer architecture
On the value locality of store instructions
Proceedings of the 27th annual international symposium on Computer architecture
A study of slipstream processors
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Slice-processors: an implementation of operation-based prediction
ICS '01 Proceedings of the 15th international conference on Supercomputing
Execution-based prediction using speculative slices
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Slipstream processors: improving both performance and fault tolerance
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Data prefetching by dependence graph precomputation
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Dynamic speculative precomputation
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Speculative Data-Driven Multithreading
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Trace processors: exploiting hierarchy and speculation
Trace processors: exploiting hierarchy and speculation
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
SlicK: slice-based locality exploitation for efficient redundant multithreading
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Lazy instruction scheduling: keeping performance, reducing power
Proceedings of the 13th international symposium on Low power electronics and design
Hi-index | 14.98 |
A slipstream processor accelerates a program by speculatively removing repeatedly ineffectual instructions. Detecting the roots of ineffectual computation驴unreferenced writes, nonmodifying writes, and correctly predicted branches驴is straightforward. On the other hand, detecting ineffectual instructions in the backward slices of these root instructions currently requires complex back-propagation circuitry. We observe that, by logically monitoring the speculative program (instead of the original program), back-propagation can be reduced to detecting unreferenced writes. That is, once root instructions are actually removed, instructions at the next higher level in the backward slice become newly exposed unreferenced writes in the speculative program. This new algorithm, called implicit back-propagation, eliminates complex hardware and achieves an average performance improvement of 11.8 percent, only marginally lower than the 12.3 percent improvement achieved with explicit back-propagation. We further simplify the hardware component by electing not to detect ineffectual memory writes, focusing only on ineffectual register writes. A minimal implementation consisting of only a register-indexed table (similar to an architectural register file) achieves a good balance between complexity and performance (11.2 percent average performance improvement with implicit back-propagation and without detection of ineffectual memory writes).