Can Linux be Rejuvenated without Reboots?

  • Authors:
  • Takeshi Yoshimura; Hiroshi Yamada; Kenji Kono

  • Venue:
  • WOSAR '11 Proceedings of the 2011 IEEE Third International Workshop on Software Aging and Rejuvenation
  • Year:
  • 2011

Abstract

Operating systems (OSes) are crucial for achieving high availability of computer systems. Even if the applications running on the operating system are highly available, a bug inside the kernel may result in a failure of the entire software stack. Rejuvenating OSes is a promising approach to preventing and recovering from transient errors. Unfortunately, OS rejuvenation is slow because the only available method is rebooting the entire OS. In this paper, we explore the possibility of rejuvenating Linux without reboots. In our previous research, we investigated the scope of error propagation in Linux. The propagation scope is process-local if the error is confined to the process context that activated it. The scope is kernel-global if the error propagates to other processes' contexts or to global data structures. If most errors are process-local, we can rejuvenate the Linux kernel without rebooting it, because the kernel returns to a consistent and clean state simply by killing the faulting process and revoking its resources. Our conclusion is that Linux can be rejuvenated without reboots with high probability. Linux is coded defensively, and thus most of the manifested errors (96%) were process-local and only one error was kernel-global.