Loop structures have long been regarded as a primary source of parallelism in sequential code. In this paper, we present a new technique that dynamically detects precise loop structures, together with their inter-procedural nesting, on a dynamic binary translation system. Taking precompiled application binary code as input, our mechanism generates simple but precise markers as the code is loaded from its binary image, and at runtime it monitors loop structures and their inter-procedural nesting on the fly using a Loop-Call Context Graph. We implement our mechanism and evaluate it using the SPEC CPU benchmark suite. The results show that our mechanism successfully reveals precise loop structures with inter-procedural loop nesting. They also show that it can reduce the overhead of loop analysis compared with existing approaches. These results indicate that our mechanism can be applied to runtime optimization and parallelization, and can provide hints for performance tuning.
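
The abstract does not spell out how the Loop-Call Context Graph is built, so the following is only a minimal sketch of the general idea, assuming that the markers emit five kinds of events (call entry and exit, loop entry, loop iteration, loop exit). The class `LoopCallContextTracker`, the `Node` fields, the helper `loop_depth`, and the event names are all hypothetical and are not taken from the paper. The sketch keeps one node per distinct chain of calls and loops, so a loop reached through a call made inside another loop is recorded as nested across the procedure boundary.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One context in the graph: a function or a loop (hypothetical layout)."""
    kind: str                       # "root", "func", or "loop"
    name: str
    parent: Optional["Node"] = field(default=None, repr=False)
    entries: int = 0                # how many times this context was entered
    iterations: int = 0             # loop iterations (loops only)
    children: dict = field(default_factory=dict, repr=False)


class LoopCallContextTracker:
    """Consumes marker events and maintains the current loop/call context."""

    def __init__(self):
        self.root = Node("root", "<root>")
        self.current = self.root

    def _enter(self, kind, name):
        # Reuse the child for this (kind, name) if the context was seen before.
        key = (kind, name)
        child = self.current.children.get(key)
        if child is None:
            child = Node(kind, name, parent=self.current)
            self.current.children[key] = child
        child.entries += 1
        self.current = child

    def _exit(self, kind, name):
        # Markers are assumed to be properly paired in this sketch.
        if (self.current.kind, self.current.name) != (kind, name):
            raise ValueError(f"unbalanced exit marker for {kind} {name}")
        self.current = self.current.parent

    def call_enter(self, func):
        self._enter("func", func)

    def call_exit(self, func):
        self._exit("func", func)

    def loop_enter(self, loop):
        self._enter("loop", loop)

    def loop_iter(self, loop):
        if (self.current.kind, self.current.name) != ("loop", loop):
            raise ValueError(f"iteration marker outside loop {loop}")
        self.current.iterations += 1

    def loop_exit(self, loop):
        self._exit("loop", loop)


def loop_depth(node):
    """Number of loops on the path from the root to node, across calls."""
    depth = 0
    while node is not None:
        if node.kind == "loop":
            depth += 1
        node = node.parent
    return depth


def demo():
    """main runs loop L1 twice; each iteration calls foo, which runs L2 3 times."""
    t = LoopCallContextTracker()
    t.call_enter("main")
    t.loop_enter("L1")
    for _ in range(2):
        t.loop_iter("L1")
        t.call_enter("foo")
        t.loop_enter("L2")
        for _ in range(3):
            t.loop_iter("L2")
        t.loop_exit("L2")
        t.call_exit("foo")
    t.loop_exit("L1")
    t.call_exit("main")
    return t
```

In `demo()`, the loop `L2` sits under the context chain `main`, `L1`, `foo`, so `loop_depth` reports a depth of 2 even though `L1` and `L2` live in different functions. The sketch ignores recursion and non-local exits, both of which a real binary-level mechanism would have to handle.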