Exploiting hardware performance counters with flow and context sensitive profiling
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
A portable sampling-based profiler for Java virtual machines
Proceedings of the ACM 2000 conference on Java Grande
A framework for reducing the cost of instrumented code
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
ECOOP '01 Proceedings of the 15th European Conference on Object-Oriented Programming
Fast, accurate call graph profiling
Software—Practice & Experience
IBM Systems Journal
Garbage-first garbage collection
Proceedings of the 4th international symposium on Memory management
Java(TM) Language Specification, The (3rd Edition) (Java (Addison-Wesley))
Java(TM) Language Specification, The (3rd Edition) (Java (Addison-Wesley))
Proceedings of the 32nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Accurate, efficient, and adaptive calling context profiling
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
The DaCapo benchmarks: java benchmarking development and analysis
Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Shadow Profiling: Hiding Instrumentation Costs with Parallelism
Proceedings of the International Symposium on Code Generation and Optimization
Advanced Java bytecode instrumentation
Proceedings of the 5th international symposium on Principles and practice of programming in Java
Proceedings of the 22nd annual ACM SIGPLAN conference on Object-oriented programming systems and applications
Parallel generational-copying garbage collection with a block-structured heap
Proceedings of the 7th international symposium on Memory management
Aspect weaving in standard Java class libraries
Proceedings of the 6th international symposium on Principles and practice of programming in Java
Platform-independent profiling in a virtual execution environment
Software—Practice & Experience
CCCP: complete calling context profiling in virtual execution environments
Proceedings of the 2009 ACM SIGPLAN workshop on Partial evaluation and program manipulation
Flexible calling context reification for aspect-oriented programming
Proceedings of the 8th ACM international conference on Aspect-oriented software development
Proceedings of the first joint WOSP/SIPEW international conference on Performance engineering
Parallel dynamic analysis on multicores with aspect-oriented programming
Proceedings of the 9th International Conference on Aspect-Oriented Software Development
Composition of dynamic analysis aspects
GPCE '10 Proceedings of the ninth international conference on Generative programming and component engineering
Comprehensive aspect weaving for Java
Science of Computer Programming
Deferred methods: accelerating dynamic program analysis on multicores
Proceedings of the Tenth International Symposium on Code Generation and Optimization
Hi-index | 0.00 |
The Calling Context Tree (CCT) is a prevailing datastructure for calling context profiling. As generating a complete CCT reflecting every method call is expensive, recent research has focused on efficiently approximating the CCT with sampling techniques. However, for tasks such as debugging, testing, and reverse engineering, complete and accurate CCTs are often needed. In this paper, we introduce the ParCCT, a novel approach to parallelizing application code and CCT generation on multicores. Each thread maintains a shadow stack and generates "packets" of method calls and returns that correspond to partial CCTs. Each packet includes a copy of the shadow stack, indicating the calling context of the first method call in the packet. Hence, packets are independent of each other and can be processed out-of-order and in parallel in order to update the CCT. Our portable and extensible implementation targets standard Java Virtual Machines, thanks to instrumentation techniques that ensure complete bytecode coverage and efficiently support custom calling context representations. The ParCCT is more than 110% faster than a primitive, non-parallel approach to CCT construction, when more than two cores are available. This speedup stems both from reduced contention and from parallelization.