A large number of enterprises need their commodity database systems to remain available at all times. Although administrator mistakes are a significant source of unavailability and cost in these systems, no study to date has sought to quantify the frequency of mistakes in the field, understand the context in which they occur, or develop system support to deal with them explicitly. In this paper, we first characterize the typical administrator tasks, testing environments, and mistakes using results from an extensive survey we have conducted of 51 experienced administrators. Given the results of this survey, we next propose system support to validate administrator actions before they are made visible to users. Our prototype implementation creates a validation environment that is an extension of a replicated database system, where administrator actions can be validated using real workloads. The prototype implements three forms of validation, including a novel form in which the behavior of a database replica can be validated even without an example of correct behavior for comparison. Our results show that the prototype can detect the major classes of administrator mistakes.
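As a rough illustration of validating a replica against real workloads, the sketch below replays the same requests against an online replica and against a replica that has received the administrator's action, then reports any replies that differ. This is only a minimal sketch of comparison-based validation under stated assumptions, not the prototype's implementation: the function name `validate_by_comparison`, the `Replica` type, and the dictionary-backed toy replicas are all hypothetical, and the abstract does not specify how replies are compared.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical stand-in for a database replica: maps a request to a reply.
Replica = Callable[[str], object]


def validate_by_comparison(
    requests: Iterable[str],
    online: Replica,
    candidate: Replica,
) -> List[Tuple[str, object, object]]:
    """Replay requests on both replicas and collect mismatching replies.

    Returns a list of (request, online_reply, candidate_reply) tuples;
    an empty list means no difference was observed for this workload.
    """
    mismatches = []
    for request in requests:
        expected = online(request)   # reply from the trusted online replica
        actual = candidate(request)  # reply from the replica under validation
        if expected != actual:
            mismatches.append((request, expected, actual))
    return mismatches


# Toy replicas backed by dictionaries (illustrative data only).
online_data = {"q1": 10, "q2": 20}
candidate_data = {"q1": 10, "q2": 99}  # simulates an administrator mistake

workload = ["q1", "q2"]
mismatches = validate_by_comparison(workload, online_data.get, candidate_data.get)
```

In this toy run, `mismatches` contains one entry, for `q2`, so the administrator's action would not be exposed to users. A comparison like this needs a correct replica to compare against; the form of validation the abstract calls novel works without one and is not covered by this sketch.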