Scientific applications are interrupted by the operating system far too often. Historically, operating systems have been written to time-share a single resource, the CPU, efficiently. We now have an abundance of cores, yet we still swap out the application to run other tasks, which increases the application's time to solution. Current task scheduling in Linux is not tuned for a high performance computing environment, where a single job runs on all available cores. For example, checking for context switches hundreds of times per second is counterproductive in this setting. One solution is to partition the cores between the operating system and the application; with the advent of many-core processors, this approach has become more attractive. This work describes our investigation into isolating application processes from the operating system using a soft-partitioning scheme. We use increasingly invasive approaches, ranging from configuration changes with available Linux features, such as control groups and pinning interrupts through CPU affinity settings, to invasive source-level code changes that reduce, or in some cases completely eliminate, application interruptions such as OS clock ticks and timers. We first explain the measures that can be taken to reduce application interruption solely with compile-time and run-time configuration of a recent unmodified Linux kernel. Although these measures have been available for some time, to our knowledge they have never been addressed in an HPC context. We then introduce our invasive method, in which we remove the involuntary preemption induced by task scheduling. Our experiments show that parallel applications benefit from these modifications even at relatively small scales. At the modest scale of our testbed, we see a 1.72% improvement, which should project into higher benefits at extreme scales.
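As a rough illustration of the soft-partitioning idea, the sketch below reserves the lowest-numbered cores for the operating system and restricts a process to the remaining ones through the Linux CPU affinity interface. It is not the configuration used in this work, which relies on control groups, interrupt pinning, and kernel changes; the helper names `partition_cores` and `pin_to_app_cores` and the choice of one reserved OS core are assumptions made for the example.

```python
import os


def partition_cores(cores, n_os_cores=1):
    """Split core ids into (os_cores, app_cores).

    Hypothetical helper: the lowest-numbered cores go to the OS side,
    the rest to the application side.
    """
    ordered = sorted(cores)
    if not 0 < n_os_cores < len(ordered):
        raise ValueError("need at least one core on each side of the partition")
    return set(ordered[:n_os_cores]), set(ordered[n_os_cores:])


def pin_to_app_cores(pid=0, n_os_cores=1):
    """Restrict a process (0 means the caller) to the application cores.

    Uses os.sched_getaffinity / os.sched_setaffinity, which are Linux-only.
    This changes only where the process may run; it does not move
    interrupts or kernel threads off the application cores.
    """
    available = os.sched_getaffinity(pid)
    _, app_cores = partition_cores(available, n_os_cores)
    os.sched_setaffinity(pid, app_cores)
    return app_cores


if __name__ == "__main__":
    # Only attempt pinning where the affinity API exists and more than
    # one core is available to partition.
    if hasattr(os, "sched_setaffinity") and len(os.sched_getaffinity(0)) > 1:
        print("application cores:", sorted(pin_to_app_cores()))
```

Process affinity alone leaves clock ticks, timers, and interrupts in place on the application cores, which is why the abstract goes on to describe interrupt pinning and source-level changes.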