Efficient Instrumentation of GPGPU Applications Using Information Flow Analysis and Symbolic Execution

Authors:
Naila Farooqui;Karsten Schwan;Sudhakar Yalamanchili
Affiliations:
College of Computing, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA;College of Computing, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA;College of Computing, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA
Venue:
Proceedings of Workshop on General Purpose Processing Using GPUs
Year:
2014

Abstract

Dynamic instrumentation of GPGPU binaries enables real-time introspection for performance debugging, correctness checks, workload characterization, and runtime optimization. Such instrumentation inserts code at the instruction level of an application while the application is running, which makes it possible to accurately profile data-dependent application behavior. The runtime overheads of instrumentation, however, can negate its utility. This paper shows how a combination of information flow analysis and symbolic execution can be used to reduce these overheads. The methods and their effectiveness are demonstrated for a variety of GPGPU codes written in OpenCL that run on AMD GPU target backends. Kernels that can be analyzed entirely via symbolic execution need not be instrumented, which eliminates kernel runtime overheads altogether. For the remaining GPU kernels, our results show improvements of 5-38% in kernel runtime overheads.
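
To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of how an information flow pass might separate branches whose outcome depends on values loaded from memory (and so would need runtime instrumentation) from branches that depend only on thread IDs and kernel arguments (and so could, in principle, be resolved symbolically before launch). The toy instruction format, the opcode names (`tid`, `param`, `load`, `cmp`, `branch`), and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """One instruction of a hypothetical, simplified kernel IR."""
    op: str                      # "tid", "param", "load", "cmp", "branch", ...
    dst: Optional[str]           # destination register, or None
    srcs: Tuple[str, ...] = ()   # source registers


def tainted_registers(kernel):
    """Return registers whose values may derive from a memory load.

    Simple forward propagation iterated to a fixed point; control
    dependences are ignored in this sketch.
    """
    tainted = set()
    changed = True
    while changed:
        changed = False
        for ins in kernel:
            if ins.dst is None or ins.dst in tainted:
                continue
            if ins.op == "load" or any(s in tainted for s in ins.srcs):
                tainted.add(ins.dst)
                changed = True
    return tainted


def branches_needing_instrumentation(kernel):
    """Indices of branches whose condition is data-dependent."""
    tainted = tainted_registers(kernel)
    return [
        i for i, ins in enumerate(kernel)
        if ins.op == "branch" and any(s in tainted for s in ins.srcs)
    ]


# Hypothetical kernel: one bounds check, then one data-dependent test.
kernel = [
    Instr("tid", "r0"),                # thread ID, known from launch config
    Instr("param", "r1"),              # kernel argument n
    Instr("cmp", "p0", ("r0", "r1")),  # tid < n
    Instr("branch", None, ("p0",)),    # depends only on tid and n
    Instr("load", "r2", ("r0",)),      # value read from global memory
    Instr("cmp", "p1", ("r2", "r1")),  # compares loaded data
    Instr("branch", None, ("p1",)),    # depends on loaded data
]

print(branches_needing_instrumentation(kernel))  # [6]
```

In this sketch only the branch at index 6 is flagged. A kernel for which the returned list is empty corresponds to the case described in the abstract where no instrumentation is needed at all.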