In an operating system kernel, critical sections of code must be protected from interruption. This is traditionally accomplished by masking the set of interrupts whose handlers interfere with the correct operation of the critical section. Because it can be expensive to communicate with an off-chip interrupt controller, more complex optimistic techniques for masking interrupts have been proposed. In this paper we present measurements of the behavior of the NetBSD 1.2 kernel, and use the measurements to explore the space of kernel synchronization schemes. We show that (a) most critical sections are very short, (b) very few are ever interrupted, (c) using the traditional synchronization technique, the synchronization cost is often higher than the time spent in the body of the critical section, and (d) under heavy load NetBSD 1.2 can spend 9% to 12% of its time in synchronization primitives. The simplest scheme we examined, disabling all interrupts while in a critical section or interrupt handler, can lead to loss of data under heavy load. A more complex optimistic scheme functions correctly under the heavy workloads we tested and has very low overhead (at most 0.3%). Based on our measurements, we present a new model that offers the simplicity of the traditional scheme with the performance of the optimistic schemes. Given the relative CPU, memory, and device performance of today's hardware, the newer techniques we examined have a much lower synchronization cost than the traditional technique. Under heavy load, such as that incurred by a web server, a system using these newer techniques will have noticeably better performance.
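As a rough illustration of the optimistic idea described above, the following Python sketch models a critical section that sets a software flag instead of communicating with the interrupt controller, and defers any interrupt that arrives while the flag is set. The class `OptimisticMask`, its methods, and the `hw_mask_writes` counter are hypothetical names introduced here for illustration. It is a toy model that assumes non-nested critical sections and a single CPU, not the scheme measured in NetBSD 1.2.

```python
class OptimisticMask:
    """Toy model of optimistic interrupt protection (hypothetical, single CPU)."""

    def __init__(self):
        self.in_critical = False  # software flag set on critical-section entry
        self.pending = []         # handlers deferred while the flag was set
        self.hw_mask_writes = 0   # simulated (expensive) interrupt-controller accesses

    def enter(self):
        # Common case: only a cheap memory write, no controller access.
        self.in_critical = True

    def interrupt(self, handler):
        if self.in_critical:
            # Rare case: the critical section was interrupted. Defer the
            # handler and pay for masking the source at the controller.
            self.pending.append(handler)
            self.hw_mask_writes += 1
        else:
            handler()

    def leave(self):
        self.in_critical = False
        if self.pending:
            # Unmask at the controller, then run whatever was deferred.
            self.hw_mask_writes += 1
            deferred, self.pending = self.pending, []
            for handler in deferred:
                handler()


# Usage: an interrupt arriving inside the critical section runs only after it ends.
log = []
mask = OptimisticMask()
mask.enter()
log.append("critical-start")
mask.interrupt(lambda: log.append("irq"))
log.append("critical-end")
mask.leave()
```

In this model an uninterrupted critical section never touches the simulated controller, which is the property that makes the optimistic approach cheap when, as the measurements indicate, very few critical sections are ever interrupted.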