Efficient and correct execution of parallel programs that share memory
ACM Transactions on Programming Languages and Systems (TOPLAS)
Detecting data races on weak memory systems
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
DART: directed automated random testing
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
CUTE: a concolic unit testing engine for C
Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering
Static and Precise Detection of Concurrency Errors in Systems Code Using SMT Solvers
CAV '09 Proceedings of the 21st International Conference on Computer Aided Verification
FLAVERS: a finite state verification technique for software systems
IBM Systems Journal
Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Scalable SMT-based verification of GPU kernel functions
Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
GRace: a low-overhead mechanism for detecting data races in GPU programs
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Symbolic crosschecking of floating-point and SIMD code
Proceedings of the sixth conference on Computer systems
Safe optimisations for shared-memory concurrent programs
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Concurrency analysis for parallel programs with textually aligned barriers
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Divergence Analysis and Optimizations
PACT '11 Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques
Symbolic testing of OpenCL code
HVC'11 Proceedings of the 7th international Haifa Verification conference on Hardware and Software: verification and testing
Generation of TLM testbenches using mutation testing
Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
GPUVerify: a verifier for GPU kernels
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Parametric flows: automated behavior equivalencing for symbolic analysis of races in CUDA programs
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Symbolic execution for software testing: three decades later
Communications of the ACM
Symbolic testing of OpenCL code
HVC'11 Proceedings of the 7th international Haifa Verification conference on Hardware and Software: verification and testing
Interleaving and lock-step semantics for analysis and verification of GPU kernels
ESOP'13 Proceedings of the 22nd European conference on Programming Languages and Systems
JST: an automatic test generation tool for industrial Java applications with strings
Proceedings of the 2013 International Conference on Software Engineering
Barrier invariants: a shared state abstraction for the analysis of data-dependent GPU kernels
Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications
A sound and complete abstraction for reasoning about parallel prefix sums
Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
Proceedings of Workshop on General Purpose Processing Using GPUs
Hi-index | 0.02 |
Programs written for GPUs often contain correctness errors such as races, deadlocks, or may compute the wrong result. Existing debugging tools often miss these errors because of their limited input-space and execution-space exploration. Existing tools based on conservative static analysis or conservative modeling of SIMD concurrency generate false alarms resulting in wasted bug-hunting. They also often do not target performance bugs (non-coalesced memory accesses, memory bank conflicts, and divergent warps). We provide a new framework called GKLEE that can analyze C++ GPU programs, locating the aforesaid correctness and performance bugs. For these programs, GKLEE can also automatically generate tests that provide high coverage. These tests serve as concrete witnesses for every reported bug. They can also be used for downstream debugging, for example to test the kernel on the actual hardware. We describe the architecture of GKLEE, its symbolic virtual machine model, and describe previously unknown bugs and performance issues that it detected on commercial SDK kernels. We describe GKLEE's test-case reduction heuristics, and the resulting scalability improvement for a given coverage target.