In this paper we present a new technique for automatically measuring the performance of tasks, functions, or arbitrary parts of a program on a multiprocessor embedded system. The technique instruments the tasks described with OpenMP, which is used to express the task parallelism, while ad hoc pragmas in the source mark any other pieces of code to profile. Both the annotations and the instrumentation are completely target-independent, so the same code can be measured on different target architectures, on simulators, or on prototypes. We validate the approach on single-processor and dual-processor LEON 3 platforms synthesized on FPGA, demonstrating a low instrumentation overhead. We also show how the information obtained with this technique can be exploited in a hardware/software design space exploration tool: given the profiling data from the single-processor prototype, the tool estimates the speed-up of a parallel application with good accuracy.
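To make the last point concrete, the sketch below shows one simple way a speed-up estimate could be derived from per-task execution times profiled on a single processor. It is not the estimation method of the paper, which the abstract does not describe. The function name `estimate_speedup`, the assumption that tasks are independent, the greedy longest-task-first assignment, and the absence of any communication or synchronization cost are all illustrative assumptions.

```python
import heapq


def estimate_speedup(task_times, n_cores):
    """Estimate the speed-up of a set of tasks on n_cores processors.

    Hypothetical model, not the paper's: tasks are assumed independent,
    with no communication or synchronization overhead. Each task is
    assigned, longest first, to the currently least-loaded core.
    """
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")

    # Time on one processor: the tasks simply run back to back.
    serial_time = sum(task_times)

    # Min-heap of per-core accumulated load.
    loads = [0.0] * n_cores
    heapq.heapify(loads)
    for t in sorted(task_times, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + t)

    # The parallel run finishes when the most loaded core finishes.
    parallel_time = max(loads)
    if parallel_time == 0:
        return 1.0  # nothing to run, so no speed-up to report
    return serial_time / parallel_time


# Example: four task timings (arbitrary units) profiled on one processor.
profiled = [4.0, 3.0, 3.0, 2.0]
print(estimate_speedup(profiled, 2))
```

With these made-up timings the two cores end up perfectly balanced, so the estimate equals the core count. A single dominant task, for example `[10.0, 1.0, 1.0]`, limits the estimate to 1.2 on two cores, because the longest task bounds the parallel run time.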