On-the-fly detection of access anomalies
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Implementing fault-tolerant services using the state machine approach: a tutorial
ACM Computing Surveys (CSUR)
Model checking for programming languages using VeriSoft
Proceedings of the 24th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Eraser: a dynamic data race detector for multithreaded programs
ACM Transactions on Computer Systems (TOCS)
Distributed systems (2nd Ed.)
RecPlay: a fully integrated practical record/replay system
ACM Transactions on Computer Systems (TOCS)
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
Efficient and precise datarace detection for multithreaded object-oriented programs
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
Ownership types for safe programming: preventing data races and deadlocks
OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Efficient on-the-fly data race detection in multithreaded C++ programs
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Building Diverse Computer Systems
HOTOS '97 Proceedings of the 6th Workshop on Hot Topics in Operating Systems (HotOS-VI)
ReEnact: using thread-level speculation mechanisms to debug data races in multithreaded codes
Proceedings of the 30th annual international symposium on Computer architecture
Software Rejuvenation: Analysis, Module and Applications
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
RacerX: effective, static detection of race conditions and deadlocks
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Countering code-injection attacks with instruction-set randomization
Proceedings of the 10th ACM conference on Computer and communications security
Randomized instruction set emulation to disrupt binary code injection attacks
Proceedings of the 10th ACM conference on Computer and communications security
A serializability violation detector for shared-memory server programs
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Speculative execution in a distributed file system
Proceedings of the twentieth ACM symposium on Operating systems principles
RaceTrack: efficient detection of data race conditions via adaptive tracking
Proceedings of the twentieth ACM symposium on Operating systems principles
Rx: treating bugs as allergies---a safe method to survive software failures
Proceedings of the twentieth ACM symposium on Operating systems principles
DieHard: probabilistic memory safety for unsafe languages
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Accurate and efficient filtering for the Intel thread checker race detector
Proceedings of the 1st workshop on Architectural and system support for improving software dependability
Automatically classifying benign and harmful data races using replay analysis
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Exploring failure transparency and the limits of generic recovery
OSDI'00 Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4
Microreboot — A technique for cheap recovery
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
N-variant systems: a secretless framework for security through diversity
USENIX-SS'06 Proceedings of the 15th conference on USENIX Security Symposium - Volume 15
The N-Version Approach to Fault-Tolerant Software
IEEE Transactions on Software Engineering
Learning from mistakes: a comprehensive study on real world concurrency bug characteristics
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Proceedings of the 4th ACM European conference on Computer systems
FastTrack: efficient and precise dynamic race detection
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
LiteRace: effective sampling for lightweight data-race detection
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
A case for an interleaving constrained shared-memory multi-processor
Proceedings of the 36th annual international symposium on Computer architecture
Grace: safe multithreaded programming for C/C++
Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications
A type and effect system for deterministic parallel Java
Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications
The use of triple-modular redundancy to improve computer reliability
IBM Journal of Research and Development
Respec: efficient online multiprocessor replayvia speculation and external determinism
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Orthrus: efficient software integrity protection on multi-cores
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
ThreadSanitizer: data race detection in practice
Proceedings of the Workshop on Binary Instrumentation and Applications
PACER: proportional detection of data races
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Finding and reproducing Heisenbugs in concurrent programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Gadara: dynamic deadlock avoidance for multithreaded programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Deadlock immunity: enabling systems to defend against deadlocks
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Bypassing races in live applications with execution filters
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Effective data-race detection for the kernel
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Stable deterministic multithreading through schedule memoization
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
DoublePlay: parallelizing sequential logging and replay
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Finding complex concurrency bugs in large multi-threaded applications
Proceedings of the sixth conference on Computer systems
Tightlip: keeping applications from spilling the beans
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
DoublePlay: Parallelizing Sequential Logging and Replay
ACM Transactions on Computer Systems (TOCS) - Special Issue APLOS 2011
Data races vs. data race bugs: telling the difference with portend
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Execution privatization for scheduler-oblivious concurrent programs
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Automated concurrency-bug fixing
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
X-ray: automating root-cause diagnosis of performance anomalies in production software
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Parallelizing data race detection
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
ConAir: featherweight concurrency bug recovery via single-threaded idempotent execution
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Safe software updates via multi-version execution
Proceedings of the 2013 International Conference on Software Engineering
Hi-index | 0.00 |
Data races are a common source of errors in multithreaded programs. In this paper, we show how to protect a program from data race errors at runtime by executing multiple replicas of the program with complementary thread schedules. Complementary schedules are a set of replica thread schedules crafted to ensure that replicas diverge only if a data race occurs and to make it very likely that harmful data races cause divergences. Our system, called Frost, uses complementary schedules to cause at least one replica to avoid the order of racing instructions that leads to incorrect program execution for most harmful data races. Frost introduces outcome-based race detection, which detects data races by comparing the state of replicas executing complementary schedules. We show that this method is substantially faster than existing dynamic race detectors for unmanaged code. To help programs survive bugs in production, Frost also diagnoses the data race bug and selects an appropriate recovery strategy, such as choosing a replica that is likely to be correct or executing more replicas to gather additional information. Frost controls the thread schedules of replicas by running all threads of a replica non-preemptively on a single core. To scale the program to multiple cores, Frost runs a third replica in parallel to generate checkpoints of the program's likely future states --- these checkpoints let Frost divide program execution into multiple epochs, which it then runs in parallel. We evaluate Frost using 11 real data race bugs in desktop and server applications. Frost both detects and survives all of these data races. Since Frost runs three replicas, its utilization cost is 3x. However, if there are spare cores to absorb this increased utilization, Frost adds only 3--12% overhead to application runtime.