To upgrade or not to upgrade: impact of online upgrades across multiple administrative domains

Authors:
Tudor Dumitras; Priya Narasimhan; Eli Tilevich
Affiliations:
Carnegie Mellon University, Pittsburgh, PA, USA; Carnegie Mellon University, Pittsburgh, PA, USA; Virginia Tech, Blacksburg, VA, USA
Venue:
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Year:
2010

Abstract

Online software upgrades are often plagued by runtime behaviors that are poorly understood and difficult to ascertain. For example, the interactions among multiple versions of the software expose the system to race conditions that can introduce latent errors or data corruption. Moreover, industry trends suggest that online upgrades are currently needed in large-scale enterprise systems, which often span multiple administrative domains (e.g., Web 2.0 applications that rely on AJAX client-side code or systems that lease cloud-computing resources). In such systems, the enterprise does not control all the tiers of the system and cannot coordinate the upgrade process, making existing techniques inadequate to prevent mixed-version races. In this paper, we present an analytical framework for impact assessment, which allows system administrators to directly compare the risk of following an online-upgrade plan with the risk of delaying or canceling the upgrade. We also describe an executable model that implements our formal impact assessment and enables a systematic approach for deciding whether an online upgrade is appropriate. Our model provides a method of last resort for avoiding undesirable program behaviors, in situations where mixed-version races cannot be avoided through other technical means.