Operating system noise, or “jitter,” is a key limiter of application scalability in high-end computing systems. Several studies have attempted to quantify the sources and effects of system interference, but few of them show how architectural and system characteristics influence the impact of OS noise at scale. In this paper, we examine the impact of three such system properties: platform balance, “noisy” node distribution, and non-blocking collective operations. Using a previously developed noise injection tool, we explore how the impact of noise varies with these platform characteristics. We provide detailed performance results indicating that a system with relatively less network bandwidth is able to absorb more noise than a system with more network bandwidth. Our results also show that application performance can be significantly degraded when only a subset of nodes is noisy. Furthermore, the placement of the noisy nodes matters, especially for applications that make substantial use of tree-based collective communication operations. Lastly, performance results indicate that non-blocking collective operations can greatly mitigate the impact of OS interference. Combined, these results show that the impact of OS noise is not solely a property of application communication behavior, but is also influenced by other properties of the system architecture and system software environment.
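The claim that a subset of noisy nodes can slow an entire application can be illustrated with a toy model. The sketch below is not the noise injection tool used in the paper. It assumes a bulk-synchronous loop with a blocking barrier after every step, and a hypothetical noise pattern in which each noisy node loses a fixed amount of time on every `noise_every`-th step, with its phase shifted by its rank. The function name `bsp_runtime` and all parameters are illustrative assumptions.

```python
def bsp_runtime(num_nodes, noisy_nodes, steps, work, noise_every, noise_len):
    """Total runtime of a bulk-synchronous loop with a blocking barrier per step.

    Assumed toy model: every node computes for `work` time units per step;
    a node listed in `noisy_nodes` is additionally delayed by `noise_len`
    whenever (step + rank) is a multiple of `noise_every`.
    """
    total = 0.0
    for step in range(steps):
        slowest = 0.0
        for node in range(num_nodes):
            elapsed = work
            if node in noisy_nodes and (step + node) % noise_every == 0:
                elapsed += noise_len  # hypothetical noise event on this node
            slowest = max(slowest, elapsed)
        # The barrier makes every node wait for the slowest one.
        total += slowest
    return total


if __name__ == "__main__":
    common = dict(num_nodes=8, steps=100, work=1.0, noise_every=10, noise_len=0.5)
    print("no noise:       ", bsp_runtime(noisy_nodes=set(), **common))
    print("one noisy node: ", bsp_runtime(noisy_nodes={0}, **common))
    print("all nodes noisy:", bsp_runtime(noisy_nodes=set(range(8)), **common))
```

In this model a single noisy node delays all eight nodes at every step where its noise event fires, and unsynchronized noise across many nodes hits more steps, since each event stalls the barrier separately. The model ignores network bandwidth, collective topology, and non-blocking collectives, all of which the paper identifies as influencing the real effect.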