Thread Tranquilizer: Dynamically reducing performance variation
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
There goes the neighborhood: performance degradation due to nearby jobs
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Hi-index | 0.00 |
Runtime irreproducibility complicates application performance evaluation on today’s high performance computers. Performance can vary significantly between seemingly identical runs; this presents a challenge to benchmarking as well as a user, who is trying to determine whether the change they made to their code is an actual improvement. In order to gain a better understanding of this phenomenon, we measure the runtime variation of two applications, PARAllel Total Energy Code (PARATEC) and Weather Research and Forecasting (WRF), on three different machines. Key associated metrics are also recorded. The data is then used to 1) quantify the magnitude and distribution of the variations and 2) gain an understanding as why the variations occur. Using our lightweight framework, Integrated Performance Monitoring (IPM), to understand the performance characteristics of individual runs, and the Inca framework to automate the procedure measurements were collected over a month’s time. The results indicate that performance can vary up to 25% and is almost always due to contention for network resources. We also found that the variation differs between machines and is almost always greater on machines with lower performing networks.