SHARP: an architecture for secure resource peering
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Basic Concepts and Taxonomy of Dependable and Secure Computing
IEEE Transactions on Dependable and Secure Computing
DART: directed automated random testing
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
EXE: automatically generating inputs of death
Proceedings of the 13th ACM conference on Computer and communications security
Emergent (mis)behavior vs. complex software systems
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
ICSE '07 Proceedings of the 29th international conference on Software Engineering
Mace: language support for building distributed systems
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
MODIST: transparent model checking of unmodified distributed systems
NSDI'09 Proceedings of the 6th USENIX symposium on Networked systems design and implementation
CrystalBall: predicting and preventing inconsistencies in deployed distributed systems
NSDI'09 Proceedings of the 6th USENIX symposium on Networked systems design and implementation
A survey of online failure prediction methods
ACM Computing Surveys (CSUR)
KleeNet: discovering insidious interaction bugs in wireless sensor networks before deployment
Proceedings of the 9th ACM/IEEE International Conference on Information Processing in Sensor Networks
KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
SEPIA: privacy-preserving aggregation of multi-domain network events and statistics
USENIX Security'10 Proceedings of the 19th USENIX conference on Security
Life, death, and the critical transition: finding liveness bugs in systems code
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
Toward online testing of federated and heterogeneous distributed systems
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
Online testing of federated and heterogeneous distributed systems
Proceedings of the ACM SIGCOMM 2011 conference
Hi-index | 0.00 |
We consider the problem of predicting faults in deployed, large-scale distributed systems that are heterogeneous and federated. Motivated by the importance of ensuring reliability of the services these systems provide, we argue that the key step in making these systems reliable is the need to automatically predict faults. For example, doing so is vital for avoiding Internet-wide outages that occur due to programming errors or misconfigurations.