PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Profile-assisted instruction scheduling
International Journal of Parallel Programming
Assigning confidence to conditional branch predictions
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Exploiting dead value information
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Resource-sensitive profile-directed data flow analysis for code optimization
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
On the value locality of store instructions
Proceedings of the 27th annual international symposium on Computer architecture
The MIPS R10000 Superscalar Microprocessor
IEEE Micro
Three Architectural Models for Compiler-Controlled Speculative Execution
IEEE Transactions on Computers
Characterizing and predicting value degree of use
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Techniques to Reduce the Soft Error Rate of a High-Performance Microprocessor
Proceedings of the 31st annual international symposium on Computer architecture
Proceedings of the 32nd annual international symposium on Computer Architecture
RENO: A Rename-Based Instruction Optimizer
Proceedings of the 32nd annual international symposium on Computer Architecture
Self-checking instructions: reducing instruction redundancy for concurrent error detection
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Early Register Deallocation Mechanisms Using Checkpointed Register Files
IEEE Transactions on Computers
Lazy instruction scheduling: keeping performance, reducing power
Proceedings of the 13th international symposium on Low power electronics and design
Slim VM: optimistic partial program loading for connected embedded Java virtual machines
Proceedings of the 6th international symposium on Principles and practice of programming in Java
SlimVM: a small footprint Java virtual machine for connected embedded systems
PPPJ '09 Proceedings of the 7th International Conference on Principles and Practice of Programming in Java
Architecture Design for Soft Errors
Architecture Design for Soft Errors
Using hardware vulnerability factors to enhance AVF analysis
Proceedings of the 37th annual international symposium on Computer architecture
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
On the exploitation of narrow-width values for improving register file reliability
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
“Slimming” a Java virtual machine by way of cold code removal and optimistic partial program loading
Science of Computer Programming
Exploring the potential of architecture-level power optimizations
PACS'03 Proceedings of the Third international conference on Power - Aware Computer Systems
DeadSpy: a tool to pinpoint program inefficiencies
Proceedings of the Tenth International Symposium on Code Generation and Optimization
Hi-index | 0.00 |
We observe a non-negligible fraction--3 to 16% in our benchmarks--of dynamically dead instructions, dynamic instruction instances that generate unused results. The majority of these instructions arise from static instructions that also produce useful results. We find that compiler optimization (specifically instruction scheduling) creates a significant portion of these partially dead static instructions. We show that most of the dynamically instructions arise from a small set of static instructions that produce dead values most of the time.We leverage this locality by proposing a dead instruction predictor and presenting a scheme to avoid the execution of predicted-dead instructions. Our predictor achieves an accuracy of 93% while identifying over 91% of the dead instructions using less than 5 KB of state. We achieve such high accuracies by leveraging future control flow information (i.e., branch predictions) to distinguish between useless and useful instances of the same static instruction.We then present a mechanism to avoid the register allocation, instruction scheduling, and execution of predicted dead instructions. We measure reductions in resource utilization averaging over 5% and sometimes exceeding 10%, covering physical register management (allocation and freeing), register file read and write traffic, and data cache accesses. Performance improves by an average of 3.6% on an architecture exhibiting resource contention. Additionally, our scheme frees future compilers from the need to consider the costs of dead instructions, enabling more aggressive code motion and optimization. Simultaneously, it mitigates the need for good path profiling information in making inter-block code motion decisions.