Performance tuning with instruction-level cost derived from call-stack sampling

Authors: Michael Dunlavey
Affiliations: Pharsight Corporation, Needham, MA
Venue: ACM SIGPLAN Notices
Year: 2007


Abstract

Except for program-counter histogramming, most modern profiling tools summarize at the level of entire functions or basic blocks, with or without additional information such as calling context or call graphs. This paper explicates the value of information about the cost of specific instructions, relative to summaries that do not include it. A good source of this information is time-random sampling of the call stack. To get the diagnostic benefit of instruction costs, it is not necessary to measure them with high precision or efficiency. In fact, manual sampling suffices quite well when it can be used. Other benefits of call-stack sampling are that it can be used with unmodified software and libraries, and that it is easily confined to the time intervals of interest. As with other profiling techniques, it can be employed repeatedly to remove all significant performance problems in single-threaded programs.
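
As a rough illustration of how instruction-level cost might be read off call-stack samples, the following Python sketch estimates the cost of each call site as the fraction of samples in which it appears on the stack. That estimator is an assumption here, since the abstract does not spell out the calculation. The function name `instruction_costs`, the `(function, line)` frame format, and the sample data are all hypothetical.

```python
from collections import Counter


def instruction_costs(samples):
    """Return {frame: fraction of samples in which that frame appears}.

    Each sample is one call stack, given as a list of (function, line)
    frames. This frame format is an assumption made for illustration.
    """
    counts = Counter()
    for stack in samples:
        # Count each frame at most once per sample, so a recursive call
        # that appears several times on one stack is not double-counted.
        counts.update(set(stack))
    total = len(samples)
    if total == 0:
        return {}
    return {frame: n / total for frame, n in counts.items()}


# Hypothetical data: four stack samples taken at random times,
# outermost frame first.
samples = [
    [("main", 10), ("load", 42), ("parse", 7)],
    [("main", 10), ("load", 42), ("parse", 7)],
    [("main", 10), ("load", 42), ("read", 3)],
    [("main", 12), ("report", 5)],
]

costs = instruction_costs(samples)
for frame, cost in sorted(costs.items(), key=lambda kv: -kv[1]):
    print(frame, f"{cost:.0%}")
```

With these made-up samples, the call at `load` line 42 is on the stack in three of four samples, so removing that call would save roughly 75% of the sampled time. With so few samples the estimate is coarse, which fits the abstract's point that high precision is not needed to locate a costly instruction.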