An empirical study of operating systems errors
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Recovery Oriented Computing (ROC): Motivation, Definition, Techniques,
Recovery Oriented Computing (ROC): Motivation, Definition, Techniques,
STRIDER: A Black-box, State-based Approach to Change and Configuration Management and Support
LISA '03 Proceedings of the 17th USENIX conference on System administration
SmartFrog Meets LCFG: Autonomous Reconfiguration with Central Policy Control
LISA '03 Proceedings of the 17th USENIX conference on System administration
Improving user-interface dependability through mitigation of human error
International Journal of Human-Computer Studies - Special isssue: HCI research in privacy and security is critical now
Automated known problem diagnosis with event traces
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Undo for operators: building an undoable e-mail store
ATEC '03 Proceedings of the annual conference on USENIX Annual Technical Conference
Improved error reporting for software that uses black-box components
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Detecting BGP configuration faults with static analysis
NSDI'05 Proceedings of the 2nd conference on Symposium on Networked Systems Design & Implementation - Volume 2
Understanding and dealing with operator mistakes in internet services
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Configuration debugging as search: finding the needle in the haystack
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Automatic misconfiguration troubleshooting with peerpressure
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Why do internet services fail, and what can be done about it?
USITS'03 Proceedings of the 4th conference on USENIX Symposium on Internet Technologies and Systems - Volume 4
Understanding and validating database system administration
ATEC '06 Proceedings of the annual conference on USENIX '06 Annual Technical Conference
Automatic configuration of internet services
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Staged deployment in mirage, an integrated software upgrade testing and distribution system
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
AutoBash: improving configuration management with operating system causality analysis
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Learning from mistakes: a comprehensive study on real world concurrency bug characteristics
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Using causality to diagnose configuration bugs
ATC'08 USENIX 2008 Annual Technical Conference on Annual Technical Conference
Towards automatic reverse engineering of software security configurations
Proceedings of the 15th ACM conference on Computer and communications security
ICAC '09 Proceedings of the 6th international conference on Autonomic computing
Barricade: defending systems against operator mistakes
Proceedings of the 5th European conference on Computer systems
Using symbolic evaluation to understand behavior in configurable software systems
Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1
Enabling configuration-independent automation by non-expert users
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Automating configuration troubleshooting with dynamic information flow analysis
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Improving software diagnosability via log enhancement
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Static extraction of program configuration options
Proceedings of the 33rd International Conference on Software Engineering
A user survey of configuration challenges in Linux and eCos
Proceedings of the Sixth International Workshop on Variability Modeling of Software-Intensive Systems
Precomputing possible configuration error diagnoses
ASE '11 Proceedings of the 2011 26th IEEE/ACM International Conference on Automated Software Engineering
Proceedings of the 10th international conference on Mobile systems, applications, and services
Dynamic reconfiguration of primary/backup clusters
USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
X-ray: automating root-cause diagnosis of performance anomalies in production software
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Systems research and innovation in data ONTAP
ACM SIGOPS Operating Systems Review
Cloud API issues: an empirical study and impact
Proceedings of the 9th international ACM Sigsoft conference on Quality of software architectures
Automated diagnosis of software configuration errors
Proceedings of the 2013 International Conference on Software Engineering
ConfDiagnoser: an automated configuration error diagnosis tool for Java software
Proceedings of the 2013 International Conference on Software Engineering
SmartFixer: fixing software configurations based on dynamic priorities
Proceedings of the 17th International Software Product Line Conference
Report on the international symposium on high confidence software (ISHCS 2011/2012)
ACM SIGSOFT Software Engineering Notes
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
ACM SIGOPS 24th Symposium on Operating Systems Principles
Do not blame users for misconfigurations
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
When the network crumbles: an empirical study of cloud network failures and their impact on services
Proceedings of the 4th annual Symposium on Cloud Computing
VM image update notification mechanism based on pub/sub paradigm in cloud
Proceedings of the 5th Asia-Pacific Symposium on Internetware
EnCore: exploiting system environment and correlation information for misconfiguration detection
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Challenges to error diagnosis in hadoop ecosystems
LISA'13 Proceedings of the 27th international conference on Large Installation System Administration
Hi-index | 0.00 |
Configuration errors (i.e., misconfigurations) are among the dominant causes of system failures. Their importance has inspired many research efforts on detecting, diagnosing, and fixing misconfigurations; such research would benefit greatly from a real-world characteristic study on misconfigurations. Unfortunately, few such studies have been conducted in the past, primarily because historical misconfigurations usually have not been recorded rigorously in databases. In this work, we undertake one of the first attempts to conduct a real-world misconfiguration characteristic study. We study a total of 546 real world misconfigurations, including 309 misconfigurations from a commercial storage system deployed at thousands of customers, and 237 from four widely used open source systems (CentOS, MySQL, Apache HTTP Server, and OpenLDAP). Some of our major findings include: (1) A majority of misconfigurations (70.0%~85.5%) are due to mistakes in setting configuration parameters; however, a significant number of misconfigurations are due to compatibility issues or component configurations (i.e., not parameter-related). (2) 38.1%~53.7% of parameter mistakes are caused by illegal parameters that clearly violate some format or rules, motivating the use of an automatic configuration checker to detect these misconfigurations. (3) A significant percentage (12.2%~29.7%) of parameter-based mistakes are due to inconsistencies between different parameter values. (4) 21.7%~57.3% of the misconfigurations involve configurations external to the examined system, some even on entirely different hosts. (5) A significant portion of misconfigurations can cause hard-to-diagnose failures, such as crashes, hangs, or severe performance degradation, indicating that systems should be better-equipped to handle misconfigurations.