In this paper we present a profiling methodology and toolkit that help developers discover hidden asymptotic inefficiencies in their code. From one or more runs of a program, our profiler automatically measures how the performance of individual routines scales as a function of input size, yielding clues to their growth rate. For each executed routine, the profiler outputs a set of tuples that aggregate performance costs by input size. The collected profiles can be used to produce performance plots and to derive trend functions by statistical curve fitting or bounding techniques. A key feature of our method is its ability to automatically measure the size of the input given to a generic code fragment: to this end, we propose an effective metric for estimating the input size of a routine and show how to compute it efficiently. We discuss several case studies showing that our approach can reveal asymptotic bottlenecks that other profilers may fail to detect, and can characterize the workload and behavior of individual routines in the context of real applications. To demonstrate the feasibility of our techniques, we implemented a Valgrind tool called aprof and performed an extensive experimental evaluation on the SPEC CPU2006 benchmarks. Our experiments show that aprof delivers performance comparable to that of other prominent Valgrind tools and can generate informative plots, even from single runs on typical workloads, for most algorithmically critical routines.
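To make the idea of per-routine tuples and trend fitting concrete, the following is a minimal Python sketch, not aprof's implementation or its output format. It assumes the input size of each call is already known (the abstract does not spell out the metric), and every name in it (`aggregate`, `worst_case_points`, `fit_power_law`, the routine `sort_list`, and the sample numbers) is hypothetical. It groups observed costs by input size and then estimates a growth exponent with an ordinary least-squares fit on log-log axes, which is only one of many possible curve-fitting choices.

```python
import math
from collections import defaultdict


def aggregate(observations):
    """Group (routine, input_size, cost) observations by routine and input size.

    Returns {routine: {input_size: (min_cost, max_cost, total_cost, calls)}}.
    The tuple layout is an assumption made for this sketch.
    """
    profile = defaultdict(dict)
    for routine, size, cost in observations:
        lo, hi, total, calls = profile[routine].get(size, (cost, cost, 0, 0))
        profile[routine][size] = (min(lo, cost), max(hi, cost), total + cost, calls + 1)
    return dict(profile)


def worst_case_points(tuples_by_size):
    """Return sorted (input_size, max_cost) pairs for one routine."""
    return sorted((size, t[1]) for size, t in tuples_by_size.items())


def fit_power_law(points):
    """Fit cost ~ c * n**k by least squares on log-log axes; return (c, k).

    Requires at least two distinct, strictly positive sizes and positive costs.
    """
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(cost) for _, cost in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    c = math.exp(mean_y - k * mean_x)
    return c, k


# Hypothetical observations: (routine, input size, cost in abstract units).
observations = [
    ("sort_list", 10, 100),
    ("sort_list", 10, 90),
    ("sort_list", 20, 400),
    ("sort_list", 40, 1600),
    ("sort_list", 80, 6400),
]

profile = aggregate(observations)
points = worst_case_points(profile["sort_list"])
c, k = fit_power_law(points)
print(f"sort_list: cost ~ {c:.2f} * n^{k:.2f}")
```

An estimated exponent near 2 for a routine expected to run in linear or near-linear time would be the kind of clue the abstract describes. In practice a fit like this should be checked against the plotted points, since a power law is only one candidate trend function.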