Scalable data race detection for partitioned global address space programs

  • Authors:
  • Chang-Seo Park; Koushik Sen; Costin Iancu

  • Affiliations:
  • University of California Berkeley, Berkeley, CA, USA; University of California Berkeley, Berkeley, CA, USA; Lawrence Berkeley National Laboratory, Berkeley, CA, USA

  • Venue:
  • Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
  • Year:
  • 2013

Abstract

Contemporary and future programming languages for HPC promote hybrid parallelism and shared-memory abstractions using a global address space. In this programming style, data races occur easily and are notoriously hard to find. Previous work on data race detection for shared-memory programs reports 10X-100X slowdowns for non-scientific programs, while previous work on distributed-memory programs instruments only communication operations. In this paper we present the first complete implementation of data race detection at scale for UPC programs. Our implementation tracks local and global memory references in the program and uses two techniques to reduce the overhead: 1) hierarchical function- and instruction-level sampling; and 2) exploiting the runtime persistence of aliasing and locality specific to Partitioned Global Address Space applications. The results indicate that both techniques are required in practice: well-optimized instruction sampling introduces overheads as high as 6500% (65X slowdown), while each technique in isolation can reduce the overhead to 1000% (10X slowdown). When the optimizations are applied together, our tool finds all previously known data races in our benchmark programs with at most 50% overhead. Furthermore, while previous results illustrate the benefits of function-level sampling, our experience shows that this technique does not work for scientific programs: instruction sampling or a hybrid approach is required.
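
The abstract does not spell out how the hierarchical sampling is implemented, so the following is only an illustrative sketch of the general idea of a two-level sampling decision, not the authors' tool. It assumes a deterministic "every N-th" policy at both levels; the class `HierarchicalSampler`, its parameters `func_period` and `instr_period`, and the function name `"kernel"` are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class HierarchicalSampler:
    """Two-level sampler: pick function calls first, then accesses inside them.

    Hypothetical illustration only; not the sampling policy from the paper.
    """
    func_period: int   # instrument every func_period-th call of each function
    instr_period: int  # in an instrumented call, log every instr_period-th access
    _calls: dict = field(default_factory=dict)
    _accesses: int = 0
    _active: bool = False

    def enter_function(self, name: str) -> bool:
        """Decide at function entry whether this invocation is instrumented."""
        count = self._calls.get(name, 0)
        self._calls[name] = count + 1
        self._active = (count % self.func_period == 0)
        self._accesses = 0
        return self._active

    def should_log_access(self) -> bool:
        """Decide per memory access; always False in a non-instrumented call."""
        if not self._active:
            return False
        self._accesses += 1
        return (self._accesses - 1) % self.instr_period == 0


def run_demo() -> int:
    """Count how many of 40 simulated accesses would be logged."""
    sampler = HierarchicalSampler(func_period=2, instr_period=5)
    logged = 0
    for _ in range(4):          # four calls of a hypothetical function "kernel"
        sampler.enter_function("kernel")
        for _ in range(10):     # ten simulated memory accesses per call
            if sampler.should_log_access():
                logged += 1
    return logged


if __name__ == "__main__":
    print(run_demo())
```

With these settings, two of the four calls are instrumented and two accesses are logged in each, so 4 of the 40 simulated accesses reach the (omitted) race-checking logic. The sketch ignores nested calls and uses fixed periods for reproducibility; a real detector would more plausibly use randomized or adaptive rates.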