Debugging Parallel Programs with Instant Replay
IEEE Transactions on Computers
Detecting violations of sequential consistency
SPAA '91 Proceedings of the third annual ACM symposium on Parallel algorithms and architectures
Orca: A Language for Parallel Programming of Distributed Systems
IEEE Transactions on Software Engineering
The Stanford Dash Multiprocessor
Computer
Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Fine-grain access control for distributed shared memory
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Shasta: a low overhead, software-only approach for supporting fine-grain shared memory
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Weak ordering—a new definition
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
OOPSLA '01 Proceedings of the 16th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Efficient and precise datarace detection for multithreaded object-oriented programs
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
Lock reservation: Java locks can mostly do without atomic operations
OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Ownership types for safe programming: preventing data races and deadlocks
OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Static conflict analysis for multi-threaded object-oriented programs
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
A low-overhead coherence solution for multiprocessors with private cache memories
ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
Software cache coherence for large scale multiprocessors
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Proceedings of the 32nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Runtime Analysis of Atomicity for Multithreaded Programs
IEEE Transactions on Software Engineering
The DaCapo benchmarks: java benchmarking development and analysis
Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Eliminating synchronization-related atomic operations with biased locking and bulk rebiasing
Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Atomicity via source-to-source translation
Proceedings of the 2006 workshop on Memory system performance and correctness
Conditional must not aliasing for static race detection
Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Goldilocks: a race and transaction-aware java runtime
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
How to shadow every byte of memory used by a program
Proceedings of the 3rd international conference on Virtual execution environments
TreadMarks: distributed shared memory on standard workstations and operating systems
WTEC'94 Proceedings of the USENIX Winter 1994 Technical Conference on USENIX Winter 1994 Technical Conference
How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs
IEEE Transactions on Computers
Velodrome: a sound and complete dynamic atomicity checker for multithreaded programs
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
DMP: deterministic shared memory multiprocessing
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Kendo: efficient deterministic multithreading in software
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Two hardware-based approaches for deterministic multiprocessor replay
Communications of the ACM - One Laptop Per Child: Vision vs. Reality
FastTrack: efficient and precise dynamic race detection
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
PRES: probabilistic replay with execution sketching on multiprocessors
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
CoreDet: a compiler and runtime system for deterministic multithreaded execution
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Respec: efficient online multiprocessor replayvia speculation and external determinism
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Memory models: a case for rethinking parallel languages and hardware
Communications of the ACM
DRFX: a simple and efficient memory model for concurrent programming languages
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
The RoadRunner Dynamic Analysis Framework for Concurrent Programs
Proceedings of the 9th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering
Proceedings of the 37th annual international symposium on Computer architecture
Transactional Memory, 2nd Edition
Transactional Memory, 2nd Edition
Parallel programming must be deterministic by default
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
A case for system support for concurrency exceptions
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
DoublePlay: parallelizing sequential logging and replay
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Dthreads: efficient deterministic multithreading
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Why nothing matters: the impact of zeroing
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
Aikido: accelerating shared data dynamic analyses
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Chimera: hybrid program analysis for determinism
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Barriers reconsidered, friendlier still!
Proceedings of the 2012 international symposium on Memory Management
A black-box approach to understanding concurrency in DaCapo
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Hi-index | 0.00 |
Parallel programming is essential for reaping the benefits of parallel hardware, but it is notoriously difficult to develop and debug reliable, scalable software systems. One key challenge is that modern languages and systems provide poor support for ensuring concurrency correctness properties - atomicity, sequential consistency, and multithreaded determinism - because all existing approaches are impractical. Dynamic, software-based approaches slow programs by up to an order of magnitude because capturing and controlling cross-thread dependences (i.e., conflicting accesses to shared memory) requires synchronization at virtually every access to potentially shared memory. This paper introduces a new software-based concurrency control mechanism called OCTET that soundly captures cross-thread dependences and can be used to build dynamic analyses for concurrency correctness. OCTET achieves low overheads by tracking the locality state of each potentially shared object. Non-conflicting accesses conform to the locality state and require no synchronization; only conflicting accesses require a state change and heavyweight synchronization. This optimistic tradeoff leads to significant efficiency gains in capturing cross-thread dependences: a prototype implementation of OCTET in a high-performance Java virtual machine slows real-world concurrent programs by only 26% on average. A dependence recorder, suitable for record & replay, built on top of OCTET adds an additional 5% overhead on average. These results suggest that OCTET can provide a foundation for developing low-overhead analyses that check and enforce concurrency correctness.