This work addresses the problem of diagnosing configuration errors that cause a system to function incorrectly. For example, a change to the local firewall policy could cause a network-based application to malfunction. Our approach is based on searching across time for the instant the system transitioned into a failed state. Based on this information, a troubleshooter or administrator can deduce the cause of failure by comparing system state before and after the failure. We present the Chronus tool, which automates the task of searching for a failure-inducing state change. Chronus takes as input a user-provided software probe, which differentiates between working and nonworking states. Chronus performs "time travel" by booting a virtual machine off the system's disk state as it existed at some point in the past. By using binary search, Chronus can find the fault point with effort that grows logarithmically with log size. We demonstrate that Chronus can diagnose a range of common configuration errors for both client-side and server-side applications, and that the performance overhead of the tool is not prohibitive.
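The search across time described above can be illustrated with a minimal sketch. This is not the Chronus implementation: the function name `find_fault_point`, the `snapshots` sequence, and the `probe` callable are hypothetical stand-ins. Here `probe(snapshot)` represents booting a virtual machine from a past disk state and running the user-provided software probe. The sketch assumes the snapshots are ordered oldest to newest, that the oldest state works and the newest fails, and that the system stays failed once it has failed.

```python
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")


def find_fault_point(snapshots: Sequence[S], probe: Callable[[S], bool]) -> int:
    """Return the index of the first snapshot for which probe() reports failure.

    Assumptions (not taken from the paper): snapshots are ordered oldest
    to newest, probe() returns True for a working state and False for a
    nonworking one, and every state after the first failure also fails.
    """
    if not snapshots:
        raise ValueError("no snapshots to search")

    lo, hi = 0, len(snapshots) - 1
    if not probe(snapshots[lo]):
        raise ValueError("oldest snapshot already fails; no fault point in range")
    if probe(snapshots[hi]):
        raise ValueError("newest snapshot works; nothing to diagnose")

    # Invariant: snapshots[lo] works and snapshots[hi] fails.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if probe(snapshots[mid]):
            lo = mid  # still working here, so the fault is later
        else:
            hi = mid  # already failing here, so the fault is here or earlier
    return hi
```

A caller would then compare `snapshots[i - 1]` with `snapshots[i]`, where `i` is the returned index, to see what changed at the fault point. Each loop iteration halves the interval, so the number of probe runs grows with the logarithm of the number of snapshots, which matters because each real probe would involve booting a virtual machine.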