DELI: a new run-time control point

Authors:
Giuseppe Desoli;Nikolay Mateev;Evelyn Duesterwald;Paolo Faraboschi;Joseph A. Fisher
Affiliations:
Hewlett-Packard Laboratories, Cambridge, MA;Hewlett-Packard Laboratories, Cambridge, MA;Hewlett-Packard Laboratories, Cambridge, MA;Hewlett-Packard Laboratories, Cambridge, MA;Hewlett-Packard Laboratories, Cambridge, MA
Venue:
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Year:
2002


Abstract

The Dynamic Execution Layer Interface (DELI) offers the following unique capability: it provides fine-grain control over the execution of programs by allowing its clients to observe and optionally manipulate every single instruction, at run time, just before it runs. DELI accomplishes this by opening up an interface to the layer between the execution of software and hardware. To avoid the slowdown that such fine-grain control would otherwise cause, DELI caches a private copy of the executed code and always runs out of its own private cache.

In addition to giving powerful control to clients, DELI opens up caching and linking to ordinary emulators and just-in-time compilers, which then get the reuse benefits of the same mechanism. For example, emulators can themselves use other clients, mixing emulation with already existing services, native code, and other emulators.

This paper describes the basic aspects of DELI, including the underlying caching and linking mechanism, the Hardware Abstraction Mechanism (HAM), the Binary-Level Translation (BLT) infrastructure, and the Application Programming Interface (API) exposed to the clients. We also cover some of the services that clients could offer through DELI, such as ISA emulation, software patching, and sandboxing. Finally, we consider a case study of emulation in detail: the emulation of a PocketPC system on the Lx/ST210 embedded VLIW processor. In this case, DELI enables us to achieve near-native performance and to mix and match native and emulated code.
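The idea of letting a client inspect each instruction once and then running from a private cache can be illustrated with a toy model. The sketch below is not DELI's API: the `CodeCache` class, the `double_adds` client, and the two-opcode instruction format are all hypothetical names invented for illustration, and linking between cached fragments is left out.

```python
from typing import Callable, Dict, List, Tuple

Instr = Tuple[str, int]          # toy instruction: (opcode, operand); an assumption
Hook = Callable[[Instr], Instr]  # hypothetical client callback


class CodeCache:
    """Toy private code cache: each block is shown to the client hook once."""

    def __init__(self, program: Dict[int, List[Instr]], hook: Hook) -> None:
        self.program = program                    # original code, keyed by block address
        self.hook = hook                          # client may observe or rewrite instructions
        self.fragments: Dict[int, List[Instr]] = {}
        self.hook_calls = 0                       # counts how often the client was invoked

    def fetch(self, addr: int) -> List[Instr]:
        # On a miss, pass every instruction through the client, then cache the result.
        if addr not in self.fragments:
            translated = []
            for instr in self.program[addr]:
                self.hook_calls += 1
                translated.append(self.hook(instr))
            self.fragments[addr] = translated
        # On a hit, the cached copy runs without involving the client again.
        return self.fragments[addr]


def run(cache: CodeCache, trace: List[int]) -> int:
    """Execute blocks in the order given by trace, always from the cache."""
    acc = 0
    for addr in trace:
        for op, arg in cache.fetch(addr):
            if op == "add":
                acc += arg
            elif op == "mul":
                acc *= arg
    return acc


def double_adds(instr: Instr) -> Instr:
    """Example client: rewrites every 'add' to add twice its operand."""
    op, arg = instr
    return (op, arg * 2) if op == "add" else instr


program = {0: [("add", 1), ("add", 2)], 8: [("mul", 3)]}
cache = CodeCache(program, double_adds)
result = run(cache, [0, 8, 0, 8])
```

In this toy run the client sees only three instructions even though seven are executed, which is the reuse effect the abstract attributes to caching: the cost of observing or rewriting code is paid on the first visit to a block, not on every execution.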