Today's operating systems treat GPUs and other computational accelerators as if they were simple devices, with bounded and predictable response times. With accelerators assuming an increasing share of the workload on modern machines, this strategy is already problematic, and likely to become untenable soon. If the operating system is to enforce fair sharing of the machine, it must assume responsibility for accelerator scheduling and resource management. Fair, safe scheduling is a particular challenge on fast accelerators, which allow applications to avoid kernel-crossing overhead by interacting directly with the device. We propose a disengaged scheduling strategy in which the kernel intercedes between applications and the accelerator only infrequently, to monitor their use of accelerator cycles and to determine which applications should be granted access over the next time interval. Our strategy assumes a well-defined, narrow interface exported by the accelerator. We build upon such an interface, systematically inferred for the latest Nvidia GPUs. We construct several example schedulers, including Disengaged Timeslice with overuse control, which guarantees fairness, and Disengaged Fair Queueing, which is effective in limiting resource idleness but probabilistic in its guarantees. Both schedulers ensure fair sharing of the GPU, even among uncooperative or adversarial applications; Disengaged Fair Queueing incurs a 4% overhead on average (max 18%) compared to direct device access across our evaluation scenarios.
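The timeslice idea with overuse control can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's implementation: the names `Task` and `run_timeslices`, the fixed per-turn `overrun`, and the rule of forfeiting one turn per slice of accumulated debt are all hypothetical choices made for illustration. The kernel is modeled as acting only at slice boundaries, while the token holder is assumed to use the device directly in between.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    overrun: float     # hypothetical: time (ms) this task runs past each slice
    used: float = 0.0  # accelerator time consumed so far (ms)
    debt: float = 0.0  # overuse not yet repaid (ms)


def run_timeslices(tasks, slice_ms, turns, overuse_control=True):
    """Pass an access token round-robin; charge overruns against future turns."""
    for turn in range(turns):
        task = tasks[turn % len(tasks)]
        if overuse_control and task.debt >= slice_ms:
            # Assumed penalty rule: forfeit this turn to repay one slice of debt.
            task.debt -= slice_ms
            continue
        # The token holder talks to the device directly during its slice.
        # Usage is only accounted for here, at the slice boundary.
        task.used += slice_ms + task.overrun
        task.debt += task.overrun
    return {t.name: t.used for t in tasks}


if __name__ == "__main__":
    polite_and_greedy = [Task("A", 0.0), Task("B", 10.0)]
    print(run_timeslices(polite_and_greedy, slice_ms=10.0, turns=40))
```

In this model a task that overruns every slice by a full slice length forfeits every other turn, so its total device time matches that of a well-behaved task. A forfeited turn simply leaves the device idle here; a real scheduler would likely hand the slice to another task.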