A Simple Way to Estimate the Cost of Downtime

Authors:
Affiliations:
Venue:
LISA '02 Proceedings of the 16th USENIX conference on System administration
Year:
2002

Abstract

Systems that are more dependable and less expensive to maintain may be more expensive to purchase. If ordinary customers cannot calculate the cost of downtime, such systems may not succeed, because it will be difficult to justify a higher price. Hence, we propose an easy-to-calculate estimate of the cost of downtime.

As one reviewer commented, the cost estimate we propose "is simply a symbolic translation of the most obvious, common sense approach to the problem." We take this remark as a compliment, noting that prior work has ignored pieces of this obvious formula.

We introduce this formula, argue why it will be important to have a formula that can easily be calculated, suggest why it will be hard to get a more accurate estimate, and give some examples.

Widespread use of this obvious formula can lay a foundation for systems that reduce downtime.