Proceedings of the 24th annual international symposium on Computer architecture
Streamlining inter-operation memory communication via data dependence prediction
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Memory dependence prediction using store sets
Proceedings of the 25th annual international symposium on Computer architecture
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Early load address resolution via register tracking
Proceedings of the 27th annual international symposium on Computer architecture
Register integration: a simple and efficient implementation of squash reuse
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Load and store reuse using register file contents
ICS '01 Proceedings of the 15th international conference on Supercomputing
Focusing processor policies via critical-path prediction
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Implementing optimizations at decode time
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Performance characterization of a hardware mechanism for dynamic optimization
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Dynamic dead-instruction detection and elimination
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
IEEE Transactions on Computers
Cherry: checkpointed early resource recycling in out-of-order microprocessors
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Three extensions to register integration
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Using Interaction Costs for Microarchitectural Bottleneck Analysis
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Exploiting Value Locality in Physical Register Files
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Macro-op Scheduling: Relaxing Scheduling Loop Constraints
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Using Dynamic Binary Translation to Fuse Dependent Instructions
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Dataflow Mini-Graphs: Amplifying Superscalar Capacity and Bandwidth
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Store Vulnerability Window (SVW): Re-Execution Filtering for Enhanced Load Optimization
Proceedings of the 32nd annual international symposium on Computer Architecture
Dynamically reducing pressure on the physical register file through simple register sharing
ISPASS '04 Proceedings of the 2004 IEEE International Symposium on Performance Analysis of Systems and Software
Self-checking instructions: reducing instruction redundancy for concurrent error detection
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Branch predictor guided instruction decoding
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
NoSQ: Store-Load Communication without a Store Queue
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
IEEE Transactions on Computers
RETCON: transactional repair without replay
Proceedings of the 37th annual international symposium on Computer architecture
Watchdog: hardware for safe and secure manual memory management and full memory safety
Proceedings of the 39th Annual International Symposium on Computer Architecture
Hi-index | 0.00 |
RENO is a modified MIPS R10000 register renamer that uses map-table "short-circuiting" to implement dynamic versions of several well-known static optimizations: move elimination, common subexpression elimination, register allocation, and constant folding. Because it implements these optimizations dynamically, RENO can apply optimizations in certain situations where static compilers cannot. Several of RENOýs component optimizations have been previously proposed as independent mechanisms. Unified renaming [13] implements dynamic move elimination and speculative memory bypassing [19] (the dynamic counterpart of register allocation). Register integration [21] implements common-subexpression elimination and speculative memory bypassing. RENO unifies these mechanisms and adds a dynamic version of constant folding, RENOCF. RENOCF uses an extended map table format and a limited form of dynamic operation fusion. Cycle-level simulation shows that RENO dynamically eliminates (i.e., optimizes away) 22% of the dynamic instructions in both SPECint2000 and MediaBench. RENOCF is responsible for 12% and 17% of the eliminations, respectively. Because dataflow dependences are collapsed around eliminated instructions, performance improves by 8% and 13%, respectively. Alternatively, because eliminated instructions do not consume issue queue entries, physical registers, or issue, bypass, register file, and execution bandwidth, RENO can be used to absorb the performance impact of a significantly scaled-down execution core.