ATOM: a system for building customized program analysis tools
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Continuous profiling: where have all the cycles gone?
ACM Transactions on Computer Systems (TOCS)
A compilation-based software estimation scheme for hardware/software co-simulation
CODES '99 Proceedings of the seventh international workshop on Hardware/software codesign
A retargetable, ultra-fast instruction set simulator
DATE '99 Proceedings of the conference on Design, automation and test in Europe
EXPRESSION: a language for architecture exploration through compiler/simulator retargetability
DATE '99 Proceedings of the conference on Design, automation and test in Europe
LISA—machine description language for cycle-accurate models of programmable DSP architectures
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
A universal technique for fast and flexible instruction-set architecture simulation
Proceedings of the 39th annual Design Automation Conference
Instruction set compiled simulation: a technique for fast and flexible instruction set simulation
Proceedings of the 40th annual Design Automation Conference
Gprof: A call graph execution profiler
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
Compilation-based software performance estimation for system level design
HLDVT '00 Proceedings of the IEEE International High-Level Validation and Test Workshop (HLDVT'00)
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Retargetable profiling for rapid, early system-level design space exploration
Proceedings of the 41st annual Design Automation Conference
Link-time optimization of ARM binaries
Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
A Programmable Hardware Path Profiler
Proceedings of the international symposium on Code generation and optimization
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Fine-grained application source code profiling for ASIP design
Proceedings of the 42nd annual Design Automation Conference
QEMU: a multihost, multitarget emulator
Linux Journal
Performance analysis through synthetic trace generation
ISPASS '00 Proceedings of the 2000 IEEE International Symposium on Performance Analysis of Systems and Software
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Valgrind: a framework for heavyweight dynamic binary instrumentation
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Shadow Profiling: Hiding Instrumentation Costs with Parallelism
Proceedings of the International Symposium on Code Generation and Optimization
L4oprof: a performance-monitoring-unit-based software-profiling framework for the L4 microkernel
ACM SIGOPS Operating Systems Review
A fast and generic hybrid simulation approach using C virtual machine
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
High-performance timing simulation of embedded software
Proceedings of the 45th annual Design Automation Conference
MAPS: an integrated framework for MPSoC application parallelization
Proceedings of the 45th annual Design Automation Conference
Cycle-approximate retargetable performance estimation at the transaction level
Proceedings of the conference on Design, automation and test in Europe
Source-level timing annotation and simulation for a heterogeneous multiprocessor
Proceedings of the conference on Design, automation and test in Europe
Discovering and Exploiting Program Phases
IEEE Micro
Processor Description Languages
Processor Description Languages
Automatic instrumentation of embedded software for high level hardware/software co-simulation
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
Fast and accurate performance simulation of embedded software for MPSoC
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
Profilers play an important role in software/hardware design, optimization, and verification. Various approaches have been proposed to implement profilers. The most widespread approach adopted in the embedded domain is Instruction Set Simulation (ISS) based profiling, which provides uncompromised accuracy but limited execution speed. Source code profilers, on the contrary, are fast but less accurate. This paper introduces TotalProf, a fast and accurate source code cross profiler that estimates the performance of an application from three aspects: First, code optimization and a novel virtual compiler backend are employed to resemble the course of target compilation. Second, an optimistic static scheduler is introduced to estimate the behavior of the target processor's datapath. Last but not least, dynamic events, such as cache misses, bus contention and branch prediction failures, are simulated at runtime. With an abstract architecture description, the tool can be easily retargeted in a performance characteristics oriented way to estimate different processor architectures, including DSPs and VLIW machines. Multiple instances of TotalProf can be integrated with SystemC to support heterogeneous Multi-Processor System-on-Chip (MPSoC) profiling. With only about a 5 to 15% error rate introduced to the major performance metrics, such as cycle count, memory accesses and cache misses, a more than one Giga-Instruction-Per-Second (GIPS) execution speed is achieved.