Recovering device drivers

  • Authors:
  • Michael M. Swift;Muthukaruppan Annamalai;Brian N. Bershad;Henry M. Levy

  • Affiliations:
  • University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA

  • Venue:
  • ACM Transactions on Computer Systems (TOCS)
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

This article presents a new mechanism that enables applications to run correctly when device drivers fail. Because device drivers are the principal failing component in most systems, reducing driver-induced failures greatly improves overall reliability. Earlier work has shown that an operating system can survive driver failures [Swift et al. 2005], but the applications that depend on them cannot. Thus, while operating system reliability was greatly improved, application reliability generally was not.To remedy this situation, we introduce a new operating system mechanism called a shadow driver. A shadow driver monitors device drivers and transparently recovers from driver failures. Moreover, it assumes the role of the failed driver during recovery. In this way, applications using the failed driver, as well as the kernel itself, continue to function as expected.We implemented shadow drivers for the Linux operating system and tested them on over a dozen device drivers. Our results show that applications and the OS can indeed survive the failure of a variety of device drivers. Moreover, shadow drivers impose minimal performance overhead. Lastly, they can be introduced with only modest changes to the OS kernel and with no changes at all to existing device drivers.