Aikido: accelerating shared data dynamic analyses

Authors:
Marek Olszewski;Qin Zhao;David Koh;Jason Ansel;Saman Amarasinghe
Affiliations:
Massachusetts Institute of Technology, Cambridge, MA, USA (all authors)
Venue:
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Year:
2012


Abstract

Despite a burgeoning demand for parallel programs, the tools available to developers working on shared-memory multicore processors have lagged behind. One reason for this is the lack of hardware support for inspecting the complex behavior of these parallel programs. Inter-thread communication, which must be instrumented for many types of analyses, may occur with any memory operation. To detect such thread communication in software, many existing tools require the instrumentation of all memory operations, which leads to significant performance overheads. To reduce this overhead, some existing tools resort to random sampling of memory operations, which introduces false negatives. Unfortunately, neither of these approaches provides the speed and accuracy programmers have traditionally expected from their tools. In this work, we present Aikido, a new system and framework that enables the development of efficient and transparent analyses that operate on shared data. Aikido uses a hybrid of existing hardware features and dynamic binary rewriting to detect thread communication with low overhead. Aikido runs a custom hypervisor below the operating system, which exposes per-thread hardware protection mechanisms not available in any widely used operating system. This hybrid approach allows us to benefit from the low cost of detecting memory accesses with hardware, while maintaining the word-level accuracy of a software-only approach. To evaluate our framework, we have implemented an Aikido-enabled vector clock race detector. Our results show that the Aikido-enabled race detector outperforms existing techniques that provide similar accuracy by up to 6.0x, and 76% on average, on the PARSEC benchmark suite.
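
The hybrid idea in the abstract can be illustrated with a toy simulation. The sketch below is not Aikido's implementation: it only models, in plain Python, a coarse page-granularity filter (standing in for the hardware protection check) placed in front of a simplified word-granularity vector clock checker (standing in for the instrumented analysis). All names (`SharingFilter`, `VectorClockDetector`, `run`, `PAGE_SIZE`) and the trace format are hypothetical. As a deliberate simplification, accesses made before a page is first seen by a second thread are never analyzed here.

```python
PAGE_SIZE = 4096  # assumed page granularity of the coarse, hardware-style check


class SharingFilter:
    """Tracks which pages have been touched by more than one thread."""

    def __init__(self):
        self.owner = {}      # page number -> first thread that touched it
        self.shared = set()  # pages touched by at least two threads

    def is_shared_access(self, tid, addr):
        page = addr // PAGE_SIZE
        if page in self.shared:
            return True
        first = self.owner.setdefault(page, tid)
        if first != tid:     # a second thread arrived: the page is now shared
            self.shared.add(page)
            return True
        return False         # still private: skip the expensive analysis


class VectorClockDetector:
    """Simplified happens-before checker over word addresses."""

    def __init__(self):
        self.clocks = {}       # tid -> {tid: logical time}
        self.lock_clocks = {}  # lock -> clock published at the last release
        self.writes = {}       # addr -> (tid, time) of the last write
        self.reads = {}        # addr -> {tid: time} of reads since that write
        self.races = []        # (addr, earlier tid, later tid)

    def _clock(self, tid):
        return self.clocks.setdefault(tid, {tid: 1})

    def acquire(self, tid, lock):
        clock = self._clock(tid)
        for t, v in self.lock_clocks.get(lock, {}).items():
            clock[t] = max(clock.get(t, 0), v)  # join with the releaser's clock

    def release(self, tid, lock):
        clock = self._clock(tid)
        self.lock_clocks[lock] = dict(clock)
        clock[tid] += 1

    @staticmethod
    def _ordered(prev_tid, prev_time, clock):
        return prev_time <= clock.get(prev_tid, 0)

    def access(self, tid, addr, is_write):
        clock = self._clock(tid)
        last_write = self.writes.get(addr)
        if last_write and not self._ordered(*last_write, clock):
            self.races.append((addr, last_write[0], tid))
        if is_write:
            for t, v in self.reads.get(addr, {}).items():
                if not self._ordered(t, v, clock):
                    self.races.append((addr, t, tid))
            self.writes[addr] = (tid, clock[tid])
            self.reads[addr] = {}
        else:
            self.reads.setdefault(addr, {})[tid] = clock[tid]


def run(trace):
    """Replay (kind, tid, target) events; analyze only shared-page accesses."""
    filt, det = SharingFilter(), VectorClockDetector()
    analyzed = 0
    for kind, tid, target in trace:
        if kind == "acq":
            det.acquire(tid, target)
        elif kind == "rel":
            det.release(tid, target)
        elif filt.is_shared_access(tid, target):
            analyzed += 1
            det.access(tid, target, kind == "w")
    return analyzed, det.races


if __name__ == "__main__":
    trace = [("w", 1, 0x9000), ("w", 1, 0x9008),
             ("w", 1, 0x1000), ("w", 2, 0x1000), ("w", 1, 0x1000)]
    print(run(trace))
```

In this toy trace, the two writes to the page at `0x9000` stay private to thread 1 and never reach the detector, while the unsynchronized writes to `0x1000` are analyzed once a second thread touches that page. The design point it tries to convey is the one the abstract makes: a cheap coarse check decides which accesses deserve the precise, per-word analysis.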