In profiling, a tradeoff exists between information and overhead. For example, hardware-sampling profilers incur negligible overhead, but the information they collect is consequently very coarse. Other profilers use instrumentation tools to gather temporal traces such as path profiles and hot memory streams, but they have high overhead. Runtime and feedback-directed compilation systems need detailed information to aggressively optimize, but the cost of gathering profiles can outweigh the benefits. Shadow profiling is a novel method for sampling long traces of instrumented code in parallel with normal execution, taking advantage of the trend of increasing numbers of cores. Each instrumented sample can be many millions of instructions in length. The primary goal is to incur negligible overhead, yet attain profile information that is nearly as accurate as a perfect profile. The profiler requires no modifications to the operating system or hardware, and is tunable to allow for greater coverage or lower overhead. We evaluate the performance and accuracy of this new profiling technique for two common types of instrumentation-based profiles: interprocedural path profiling and value profiling. Overall, profiles collected using the shadow profiling framework are 94% accurate versus perfect value profiles, while incurring less than 1% overhead. Consequently, this technique increases the viability of dynamic and continuous optimization systems by hiding the high overhead of instrumentation and enabling the online collection of many types of profiles that were previously too costly.
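The idea of sampling instrumented code in parallel with normal execution can be illustrated with a minimal sketch. The abstract does not say how the parallel copy is created, so this sketch assumes a Unix `fork()`-style copy of the running process: the original keeps executing uninstrumented, while each forked "shadow" runs an instrumented segment of fixed length on its private copy of the state and reports a value profile through a pipe. The toy workload and all names (`workload_step`, `observed_value`, `run_with_shadow_samples`, `sample_every`, `sample_len`) are hypothetical and chosen only for illustration.

```python
import json
import os
from collections import Counter


def workload_step(state):
    """One step of a toy workload: advance a linear congruential generator."""
    return (state * 1103515245 + 12345) % 2**31


def observed_value(state):
    """The value being profiled (here, a small bucket derived from the state)."""
    return (state >> 16) % 8


def run_with_shadow_samples(total_steps, sample_every, sample_len):
    """Run the workload; periodically fork a shadow that profiles one segment.

    Assumption: a Unix-like system where os.fork() is available.
    """
    state = 1
    readers = []  # (pid, read end of pipe) for each shadow process
    for step in range(total_steps):
        if step % sample_every == 0:
            r, w = os.pipe()
            pid = os.fork()
            if pid == 0:
                # Shadow process: instrumented execution on a private copy.
                os.close(r)
                counts = Counter()
                shadow_state = state
                for _ in range(sample_len):
                    shadow_state = workload_step(shadow_state)
                    counts[observed_value(shadow_state)] += 1  # instrumentation
                with os.fdopen(w, "w") as out:
                    json.dump(counts, out)  # int keys become strings in JSON
                os._exit(0)  # never fall through into the parent's loop
            os.close(w)
            readers.append((pid, r))
        state = workload_step(state)  # original process: no instrumentation

    # Merge the per-sample profiles once the original run has finished.
    profile = Counter()
    for pid, r in readers:
        with os.fdopen(r) as inp:
            profile.update({int(k): v for k, v in json.load(inp).items()})
        os.waitpid(pid, 0)
    return state, profile
```

In this sketch, `sample_every` and `sample_len` play the role of the tuning knobs mentioned above: more frequent or longer samples give greater coverage, while fewer or shorter ones keep the cost on the original process lower. A real system would also need to bound the number of live shadows and handle side effects such as system calls, which this toy example ignores.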