Robust distributed systems commonly employ high-level recovery mechanisms that let the system recover from a wide variety of problematic environmental conditions, such as node failures, packet drops, and link disconnections. Unfortunately, these recovery mechanisms also mask serious design and implementation errors, disguising them as latent performance bugs that severely degrade end-to-end system performance. These bugs typically go unnoticed because it is hard to distinguish a bug from an intermittent environmental condition that the system must tolerate. We present techniques that automatically pinpoint latent performance bugs in systems implementations, in the spirit of recent advances in model checking by systematic state space exploration. The techniques automate three steps: conducting random simulations, identifying performance anomalies, and analyzing anomalous executions to pinpoint the circumstances that lead to performance degradation. Our implementation, MACEPC, targets the MACE toolkit, so it can test our implementations directly, without modification. We have applied MACEPC to five thoroughly tested and trusted distributed systems implementations. MACEPC found significant, previously unknown, long-standing performance bugs in each of the systems and led to fixes that significantly improved their end-to-end performance.
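The three-step loop described above (random simulation, anomaly identification, analysis of anomalous executions) can be illustrated with a toy sketch. This is not MACEPC's algorithm or API: the simulated system, the injected slow recovery path, the median-based threshold, and every name below (`simulate`, `find_anomalies`, `suspect_events`) are hypothetical and chosen only to make the idea concrete.

```python
import random
import statistics


def simulate(seed, steps=20):
    """Hypothetical toy system: return (event trace, completion time).

    Every step costs 1 time unit, whether the message is delivered or dropped.
    Injected bug (an assumption for illustration): a drop at step 3 sends the
    system down a slow recovery path that costs 50 extra time units.
    """
    rng = random.Random(seed)
    trace, elapsed = [], 0
    for step in range(steps):
        dropped = rng.random() < 0.2
        trace.append((step, "drop" if dropped else "deliver"))
        elapsed += 1
        if dropped and step == 3:
            elapsed += 50
    return trace, elapsed


def find_anomalies(runs, factor=3.0):
    """Flag runs whose completion time exceeds factor times the median."""
    median = statistics.median(time for _, time in runs.values())
    return {seed for seed, (_, time) in runs.items() if time > factor * median}


def suspect_events(runs, anomalous):
    """Return events shared by all anomalous runs and seen in no normal run."""
    normal_events = set()
    for seed, (trace, _) in runs.items():
        if seed not in anomalous:
            normal_events.update(trace)
    shared = None
    for seed in anomalous:
        events = set(runs[seed][0])
        shared = events if shared is None else shared & events
    return (shared or set()) - normal_events


# Step 1: random simulations, one per seed.
runs = {seed: simulate(seed) for seed in range(200)}
# Step 2: identify executions with anomalous performance.
anomalous = find_anomalies(runs)
# Step 3: analyze anomalous executions to isolate the triggering event.
suspects = suspect_events(runs, anomalous)
print(sorted(suspects))
```

In this sketch the slow runs still complete, much as a recovery mechanism would hide the fault in a real system; only the comparison against normal runs exposes the drop at step 3 as the common trigger.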