A major productivity hurdle for parallel programming is the presence of data races. Data races can lead to all kinds of harmful program behaviors, including determinism violations and corrupted memory. However, runtime overheads of current dynamic data race detectors are still prohibitively large (often incurring slowdowns of 10× or more) for use in mainstream software development.

In this paper, we present an efficient dynamic race detection algorithm that handles both the async-finish task-parallel programming model used in languages such as X10 and Habanero Java (HJ) and the spawn-sync constructs used in Cilk.

We have implemented our algorithm in a tool called TaskChecker and evaluated it on a suite of 12 benchmarks. To reduce overhead of the dynamic analysis, we have also implemented various static optimizations in the tool. Our experimental results indicate that our approach performs well in practice, incurring an average slowdown of 3.05× compared to a serial execution in the optimized case.
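To make the notion of a data race in the async-finish model concrete, the following Python sketch may help. It is not TaskChecker's algorithm, which this abstract does not describe. It is a brute-force illustration under stated assumptions: a program is modeled as a nested tuple tree of hypothetical `finish`, `async`, `read`, and `write` nodes, and two accesses are treated as potentially parallel when, at the point where their tree paths diverge, the earlier branch is an `async`. All names (`collect`, `may_run_in_parallel`, `find_races`) are invented for this example.

```python
from itertools import combinations

# Assumed toy model: ("finish", [children]), ("async", [children]),
# ("read", var) or ("write", var). Not the paper's data structures.

def collect(node, path, out):
    """Record each access with its path of (child_index, child_kind) pairs."""
    kind = node[0]
    if kind in ("read", "write"):
        out.append((path, kind, node[1]))
        return
    for i, child in enumerate(node[1]):
        collect(child, path + [(i, child[0])], out)

def may_run_in_parallel(p1, p2):
    """True if the earlier branch at the divergence point is an async."""
    for (i1, k1), (i2, k2) in zip(p1, p2):
        if i1 != i2:
            left_kind = k1 if i1 < i2 else k2
            return left_kind == "async"
    return False  # identical paths: the same access

def find_races(program):
    """Pairwise check: same variable, at least one write, possibly parallel."""
    accesses = []
    collect(program, [], accesses)
    races = []
    for (p1, k1, v1), (p2, k2, v2) in combinations(accesses, 2):
        if v1 == v2 and "write" in (k1, k2) and may_run_in_parallel(p1, p2):
            races.append((v1, k1, k2))
    return races

# finish { async { x = ... }; ... = x }  : the read may overlap the write
racy = ("finish", [("async", [("write", "x")]), ("read", "x")])

# finish { finish { async { x = ... } }; ... = x }  : inner finish orders them
safe = ("finish", [("finish", [("async", [("write", "x")])]), ("read", "x")])

print(find_races(racy))
print(find_races(safe))
```

The pairwise comparison is quadratic in the number of accesses, which is exactly the kind of cost a practical detector has to avoid; the sketch is meant only to show what is being detected, not how to detect it efficiently.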