In the presence of a long-latency instruction such as an L2 miss, the issue queue (IQ) may fill with instructions that depend on the miss; consequently, the IQ cannot expose instruction-level parallelism until the miss is resolved. In the context of memory-latency-tolerant processors, we propose delaying the insertion into the IQ of instructions that depend on loads predicted to miss in the L2 cache. These instructions are held in an instruction buffer instead of being inserted into the IQ, and they are inserted into the IQ only after the L2 miss is resolved. Results show that the proposal reduces the total number of replays by 37% (integer benchmarks) to 61% (floating-point benchmarks), with an average performance degradation of at most 2% and an average overall-chip energy reduction of around 8% on the floating-point benchmarks.
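The dispatch policy described above can be illustrated with a minimal software sketch. It is not the authors' hardware design: the names `Instr`, `Dispatcher`, `delay_buffer`, `tainted`, and the `predicted_l2_miss` flag are hypothetical, the miss predictor is assumed to exist and is modelled as a plain boolean, and issue and execution are not modelled at all, so instructions never leave the IQ here. The sketch only shows the routing decision: dependents (direct or transitive) of a load predicted to miss L2 go to a side buffer, and they move to the IQ once every such load they wait on has resolved.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Instr:
    name: str
    dest: Optional[str] = None          # destination register, if any
    srcs: Tuple[str, ...] = ()          # source registers
    predicted_l2_miss: bool = False     # assumed output of a hypothetical miss predictor


class Dispatcher:
    def __init__(self):
        self.issue_queue = []           # instructions visible to wakeup/select
        self.delay_buffer = deque()     # (instr, unresolved loads it waits on), program order
        self.tainted = {}               # register -> set of unresolved predicted-miss loads

    def dispatch(self, instr):
        # Collect every unresolved predicted-miss load this instruction depends on.
        waiting_on = set()
        for src in instr.srcs:
            waiting_on |= self.tainted.get(src, set())
        if waiting_on:
            # Dependent on a predicted miss: keep it out of the IQ and
            # propagate the dependence through its destination register.
            if instr.dest is not None:
                self.tainted[instr.dest] = set(waiting_on)
            self.delay_buffer.append((instr, waiting_on))
            return "buffer"
        if instr.dest is not None:
            if instr.predicted_l2_miss:
                self.tainted[instr.dest] = {instr.name}
            else:
                # An independent write clears any stale dependence mark.
                self.tainted.pop(instr.dest, None)
        self.issue_queue.append(instr)
        return "iq"

    def resolve_miss(self, load_name):
        # Clear the resolved load from the register marks.
        for loads in self.tainted.values():
            loads.discard(load_name)
        self.tainted = {reg: loads for reg, loads in self.tainted.items() if loads}
        # Move buffered instructions with no remaining unresolved loads into the IQ.
        still_waiting = deque()
        for instr, waiting_on in self.delay_buffer:
            waiting_on.discard(load_name)
            if waiting_on:
                still_waiting.append((instr, waiting_on))
            else:
                self.issue_queue.append(instr)
        self.delay_buffer = still_waiting


if __name__ == "__main__":
    d = Dispatcher()
    for i in [
        Instr("ld1", dest="r1", srcs=("r9",), predicted_l2_miss=True),
        Instr("add1", dest="r2", srcs=("r1",)),   # depends on ld1
        Instr("mul1", dest="r3", srcs=("r2",)),   # depends on ld1 through add1
        Instr("sub1", dest="r4", srcs=("r5",)),   # independent
    ]:
        print(i.name, "->", d.dispatch(i))
    d.resolve_miss("ld1")
    print([i.name for i in d.issue_queue])
```

In this sketch the independent instruction `sub1` reaches the IQ while `add1` and `mul1` wait in the buffer, which is the effect the proposal relies on to keep IQ entries available for instructions that can actually issue during the miss.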