Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Highly accurate data value prediction using hybrid predictors
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Putting the fill unit to work: dynamic optimizations for trace cache microprocessors
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Value speculation scheduling for high performance processors
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
The Jalapeño dynamic optimizing compiler for Java
JAVA '99 Proceedings of the ACM 1999 conference on Java Grande
A hardware mechanism for dynamic extraction and relayout of program hot spots
Proceedings of the 27th annual international symposium on Computer architecture
Proceedings of the 27th annual international symposium on Computer architecture
Dynamo: a transparent dynamic optimization system
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Practicing JUDO: Java under dynamic optimizations
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Predictor-directed stream buffers
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
rePLay: A Hardware Framework for Dynamic Optimization
IEEE Transactions on Computers
IEEE Micro
IEEE Micro
Basic Block Distribution Analysis to Find Periodic Behavior and Simulation Points in Applications
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Code Reordering and Speculation Support for Dynamic Optimization System
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Code Specialization Based on Value Profiles
SAS '00 Proceedings of the 7th International Symposium on Static Analysis
Pointer cache assisted prefetching
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
An infrastructure for adaptive dynamic optimization
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Instruction Pre-Processing in Trace Processors
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
The Jrpm system for dynamically parallelizing Java programs
Proceedings of the 30th annual international symposium on Computer architecture
The Performance of Runtime Data Cache Prefetching in a Dynamic Optimization System
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Hardware Support for Control Transfers in Code Caches
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Exploring Code Cache Eviction Granularities in Dynamic Optimization Systems
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Power Awareness through Selective Dynamically Optimized Traces
Proceedings of the 31st annual international symposium on Computer architecture
Runtime specialization with optimistic heap analysis
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
The java hotspotTM server compiler
JVM'01 Proceedings of the 2001 Symposium on JavaTM Virtual Machine Research and Technology Symposium - Volume 1
Runtime specialization with optimistic heap analysis
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
A Self-Repairing Prefetcher in an Event-Driven Dynamic Optimization Framework
Proceedings of the International Symposium on Code Generation and Optimization
Performance driven data cache prefetching in a dynamic software optimization system
Proceedings of the 21st annual international conference on Supercomputing
Thread warping: a framework for dynamic synthesis of thread accelerators
CODES+ISSS '07 Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis
Accurate branch prediction for short threads
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Scenario Based Optimization: A Framework for Statically Enabling Online Optimizations
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Dynamic performance tuning for speculative threads
Proceedings of the 36th annual international symposium on Computer architecture
MTCrossBit: A Dynamic Binary Translation System Using Multithreaded Optimization Framework
ICA3PP '09 Proceedings of the 9th International Conference on Algorithms and Architectures for Parallel Processing
Feedback-directed specialization of code
Computer Languages, Systems and Structures
A cross-layer approach to heterogeneity and reliability
MEMOCODE'09 Proceedings of the 7th IEEE/ACM international conference on Formal Methods and Models for Codesign
Dynamic binary translation specialized for embedded systems
Proceedings of the 6th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Improving instrumentation speed via buffering
Proceedings of the Workshop on Binary Instrumentation and Applications
Software data spreading: leveraging distributed caches to improve single thread performance
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Runtime parallelization of legacy code on a transactional memory system
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Thread Warping: Dynamic and Transparent Synthesis of Thread Accelerators
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Loaf: a framework and infrastructure for creating online adaptive solutions
Proceedings of the 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era
A dynamic optimization framework for OpenMP
IWOMP'11 Proceedings of the 7th international conference on OpenMP in the Petascale era
Improving performance through deep value profiling and specialization with code transformation
Computer Languages, Systems and Structures
DDGacc: boosting dynamic DDG-based binary optimizations through specialized hardware support
VEE '12 Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
Background optimization in full system binary translation
Programming and Computing Software
Dynamically dispatching speculative threads to improve sequential execution
ACM Transactions on Architecture and Code Optimization (TACO)
Coalition threading: combining traditional andnon-traditional parallelism to maximize scalability
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
A low-overhead dynamic optimization framework for multicores
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Hi-index | 0.00 |
Dynamic optimization has the potential to adapt the programýs behavior at run-time to deliver performance improvements over static optimization. Dynamic optimization systems usually perform their optimization in series with the applicationýs execution. This incurs overhead which reduces the benefit of dynamic optimization, and prevents some aggressive optimizations from being performed. In this paper we propose a new dynamic optimization framework called Trident. Concurrent with the programýs execution, the framework uses hardware support to identify optimization opportunities, and uses spare threads on a multithreaded processor to perform dynamic optimizations for these optimization events. We evaluate the benefit of using Trident to guide code layout, basic compiler optimizations, and value specialization. Our results show that using Trident with these optimizations achieves an average 20% speedup, and is complementary with other memory latency tolerant techniques, such as prefetching.