Exploiting hardware performance counters with flow and context sensitive profiling
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
A framework for reducing the cost of instrumented code
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Using finite experiments to study asymptotic performance
Experimental algorithmics
Fast, accurate call graph profiling
Software—Practice & Experience
Cross-architecture performance predictions for scientific applications using parameterized models
Proceedings of the joint international conference on Measurement and modeling of computer systems
Practical Path Profiling for Dynamic Optimizers
Proceedings of the international symposium on Code generation and optimization
Operating Systems Design and Implementation (3rd Edition)
Operating Systems Design and Implementation (3rd Edition)
Accurate, efficient, and adaptive calling context profiling
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Preferential path profiling: compactly numbering interesting paths
Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Using Valgrind to detect undefined value errors with bit-precision
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
Fault Detection in Multi-Threaded C++ Server Applications
Electronic Notes in Theoretical Computer Science (ENTCS)
Measuring empirical computational complexity
Proceedings of the the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Call path profiling of monotonic program resources in UNIX
Usenix-stc'93 Proceedings of the USENIX Summer 1993 Technical Conference on Summer technical conference - Volume 1
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
SPEC OMP2012 -- an application benchmark suite for parallel systems using OpenMP
IWOMP'12 Proceedings of the 8th international conference on OpenMP in a Heterogeneous World
A black-box approach to understanding concurrency in DaCapo
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Hi-index | 0.00 |
A crucial aspect in software development is understanding how an application's performance scales as a function of its input data. Estimating the empirical cost function of individual routines of a program can help developers predict the runtime on larger workloads and pinpoint asymptotic inefficiencies in the code. While this has been the target of extensive research in performance profiling, a major limitation of state-of-the-art approaches is that the input size is assumed to be determinable from the program's state prior to the invocation of the routine to be profiled, failing to characterize the scenario where routines dynamically receive input values during their activations. This results in missing workloads generated by kernel system calls (e.g., in response to I/O or network operations) or by other threads, which play a crucial role in modern concurrent and interactive applications. Measuring dynamic workloads poses several challenges, requiring shared-memory communication between threads to be efficiently traced. In this paper we present a new metric and an efficient algorithm for automatically estimating the size of the input of each routine activation. We provide examples showing that our metric allows the estimation of the empirical cost functions of complex applications more precisely than previous approaches. An extensive experimental investigation on a variety of benchmarks shows that our metric can be integrated in a Valgrind-based profiler incurring overheads comparable to other prominent heavyweight dynamic analysis tools.