Maintaining Consistency and Bounding Capacity of Software Code Caches

Authors:
Derek Bruening; Saman Amarasinghe
Affiliations:
MIT Computer Science and Artificial Intelligence Laboratory and Determina Corporation; MIT Computer Science and Artificial Intelligence Laboratory and Determina Corporation
Venue:
Proceedings of the international symposium on Code generation and optimization
Year:
2005

Abstract

Software code caches are becoming ubiquitous in dynamic optimizers, runtime tool platforms, dynamic translators, fast simulators and emulators, and dynamic compilers. Caching frequently executed fragments of code provides significant performance boosts, reducing the overhead of translation and emulation and meeting or exceeding native performance in dynamic optimizers. One disadvantage of caching, memory expansion, can sometimes be ignored when executing a single application. However, as optimizers and translators are applied more and more in production systems, the memory expansion from running multiple applications simultaneously becomes problematic. A second drawback of caching is the added requirement of maintaining consistency between the code cache and the original code. On architectures like IA-32 that do not require explicit application actions when modifying code, detecting code changes is challenging. Again, consistency can be ignored for certain sets of applications, but as caching systems scale up to executing large, modern, complex programs, consistency becomes critical. This paper presents efficient schemes for keeping a software code cache consistent and for dynamically bounding code cache size to match the current working set of the application. These schemes are evaluated in the DynamoRIO runtime code manipulation system and operate on stock hardware in the presence of multiple threads and dynamic behavior, including dynamically loaded, generated, and even modified code.