Debugging Parallel Programs with Instant Replay
IEEE Transactions on Computers
Efficient detection of determinacy races in Cilk programs
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Eraser: a dynamic data race detector for multithreaded programs
ACM Transactions on Computer Systems (TOCS)
The implementation of the Cilk-5 multithreaded language
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
An infrastructure for adaptive dynamic optimization
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Atomizer: a dynamic atomicity checker for multithreaded programs
Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on Principles of programming languages
On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs
Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
A serializability violation detector for shared-memory server programs
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Thread-Shared Software Code Caches
Proceedings of the International Symposium on Code Generation and Optimization
AVIO: detecting atomicity violations via access interleaving invariants
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Accurate and efficient filtering for the Intel thread checker race detector
Proceedings of the 1st workshop on Architectural and system support for improving software dependability
Dynamic instrumentation of production systems
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
Execution replay of multiprocessor virtual machines
Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Velodrome: a sound and complete dynamic atomicity checker for multithreaded programs
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Transactional memory with strong atomicity using off-the-shelf memory protection hardware
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
DMP: deterministic shared memory multiprocessing
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Kendo: efficient deterministic multithreading in software
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
FastTrack: efficient and precise dynamic race detection
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
LiteRace: effective sampling for lightweight data-race detection
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Umbra: efficient and scalable memory shadowing
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
PACER: proportional detection of data races
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Efficient memory shadowing for 64-bit architectures
Proceedings of the 2010 international symposium on Memory management
Effective data-race detection for the kernel
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Deterministic process groups in dOS
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Stable deterministic multithreading through schedule memoization
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Dthreads: efficient deterministic multithreading
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Toward a formal semantic framework for deterministic parallel programming
DISC'11 Proceedings of the 25th international conference on Distributed computing
Benchmarking modern multiprocessors
Benchmarking modern multiprocessors
Practical memory checking with Dr. Memory
CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
RADISH: always-on sound and complete Ra Detection in Software and Hardware
Proceedings of the 39th Annual International Symposium on Computer Architecture
IFRit: interference-free regions for dynamic data-race detection
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Proceedings of the 8th ACM European Conference on Computer Systems
OCTET: capturing and controlling cross-thread dependences efficiently
Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications
Low-level detection of language-level data races with LARD
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
Despite a burgeoning demand for parallel programs, the tools available to developers working on shared-memory multicore processors have lagged behind. One reason for this is the lack of hardware support for inspecting the complex behavior of these parallel programs. Inter-thread communication, which must be instrumented for many types of analyses, may occur with any memory operation. To detect such thread communication in software, many existing tools require the instrumentation of all memory operations, which leads to significant performance overheads. To reduce this overhead, some existing tools resort to random sampling of memory operations, which introduces false negatives. Unfortunately, neither of these approaches provide the speed and accuracy programmers have traditionally expected from their tools. In this work, we present Aikido, a new system and framework that enables the development of efficient and transparent analyses that operate on shared data. Aikido uses a hybrid of existing hardware features and dynamic binary rewriting to detect thread communication with low overhead. Aikido runs a custom hypervisor below the operating system, which exposes per-thread hardware protection mechanisms not available in any widely used operating system. This hybrid approach allows us to benefit from the low cost of detecting memory accesses with hardware, while maintaining the word-level accuracy of a software-only approach. To evaluate our framework, we have implemented an Aikido-enabled vector clock race detector. Our results show that the Aikido enabled race-detector outperforms existing techniques that provide similar accuracy by up to 6.0x, and 76% on average, on the PARSEC benchmark suite.