Memory access instrumentation is fundamental to many applications such as software transactional memory systems, profiling tools and race detectors. We examine the problem of efficiently instrumenting memory accesses in x86 machine code to support software transactional memory and profiling. We aim to automatically instrument all shared memory accesses in critical sections of x86 binaries, while achieving overhead close to that obtained when performing manual instrumentation at the source code level. The two primary options in building such an instrumentation system are static and dynamic binary rewriting: the former instruments binaries at link time before execution, while the latter instruments binaries at runtime. Static binary rewriting offers extremely low overhead but is hampered by the limits of static analysis. Dynamic binary rewriting is able to use runtime information but typically incurs higher overhead. This paper proposes an alternative: hybrid binary rewriting. Hybrid binary rewriting is built around the idea of a persistent instrumentation cache (PIC), an on-disk file that is associated with a binary and contains instrumented code from it. It supports two execution modes: active and passive. In active mode, a dynamic binary rewriting engine (Pin) intercepts execution and writes instrumented code into the PIC. This mode can take full advantage of runtime information. Subsequent runs can use passive mode, in which instrumented code is executed directly out of the PIC. This allows us to attain overheads similar to those incurred with static binary rewriting. This instrumentation methodology enables a variety of static and dynamic techniques to be applied. For example, in passive mode, execution occurs directly from the original executable except in regions that require instrumentation. This has allowed us to build a low-overhead transactional memory profiler.
We also demonstrate how combining static and dynamic techniques lets us eliminate instrumentation for accesses to thread-private locations.
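The active and passive modes described above can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not the system's actual implementation: the "binary" is a dictionary of instruction strings, the PIC is a JSON file, and every name (`BINARY`, `instrument`, `run_active`, `run_passive`, `access_barrier`) is hypothetical. It shows only the control flow: an active run populates the on-disk cache, and a later passive run executes original code except where the cache holds an instrumented region.

```python
import json
import os
import tempfile

# Toy "binary": region address -> (instructions, needs_instrumentation).
# Addresses, instruction strings, and the flag are hypothetical stand-ins.
BINARY = {
    0x1000: (["mov eax, 1", "add eax, 2"], False),               # outside critical section
    0x2000: (["mov [shared], eax", "mov ebx, [shared]"], True),  # inside critical section
}


def instrument(instructions):
    """Insert a (hypothetical) barrier call before every memory access."""
    out = []
    for ins in instructions:
        if "[" in ins:  # toy test for a memory operand
            out.append("call access_barrier")
        out.append(ins)
    return out


def run_active(trace, pic_path):
    """Active mode: intercept each executed region and persist instrumented code."""
    pic = {}
    for addr in trace:
        code, needs = BINARY[addr]
        if needs and str(addr) not in pic:
            pic[str(addr)] = instrument(code)
    with open(pic_path, "w") as f:
        json.dump(pic, f)  # the PIC is an on-disk file
    return pic


def run_passive(trace, pic_path):
    """Passive mode: run original code, except regions found in the PIC.

    Limitation of this toy: a region that needs instrumentation but was never
    seen in an active run simply executes uninstrumented here.
    """
    with open(pic_path) as f:
        pic = json.load(f)
    executed = []
    for addr in trace:
        executed.extend(pic.get(str(addr), BINARY[addr][0]))
    return executed


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "toy.pic.json")
    run_active([0x1000, 0x2000], path)
    print(run_passive([0x2000, 0x1000], path))
```

In this sketch the cost of deciding what to instrument is paid once, in the active run; the passive run only performs a lookup per region, which is the intuition behind the claim that passive execution approaches the overhead of static rewriting.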