In this paper, we present parallel on-line optimization algorithms for parameter tuning of parallel programs. We employ direct search algorithms that update parameters based on real-time performance measurements. We discuss the impact of performance variability on the accuracy and efficiency of the optimization algorithms and propose modified versions of the direct search algorithms to cope with it. The modified versions use multiple samples instead of a single sample to estimate performance more accurately. We present preliminary results showing that the performance variability of applications on clusters is heavy-tailed. Finally, we study and demonstrate the performance of the proposed algorithms on a real scientific application.
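The idea of a direct search that estimates performance from several samples can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a simple sequential compass (coordinate) search over integer parameters, takes the minimum of k measurements as the estimator (the abstract does not say which estimator is used), and models run time with a hypothetical quadratic cost multiplied by Pareto-distributed noise to mimic heavy-tailed variability. All function names and the cost model are illustrative assumptions.

```python
import random


def sample_estimate(measure, point, k):
    """Estimate the cost at `point` from k noisy measurements.

    Taking the minimum is an assumed estimator; it is one common choice
    when noise can only slow a run down and has a heavy tail.
    """
    return min(measure(point) for _ in range(k))


def compass_search(measure, start, step, k, max_iters=100):
    """Compass search over integer parameters using k samples per point."""
    best = list(start)
    best_cost = sample_estimate(measure, best, k)
    while step >= 1 and max_iters > 0:
        max_iters -= 1
        improved = False
        for i in range(len(best)):
            for delta in (-step, step):
                cand = list(best)
                cand[i] += delta
                cost = sample_estimate(measure, cand, k)
                if cost < best_cost:
                    best, best_cost = cand, cost
                    improved = True
        if not improved:
            step //= 2  # shrink the stencil when no neighbour is better
    return best, best_cost


def true_cost(point):
    """Hypothetical noise-free run time with its minimum at (12, 5)."""
    x, y = point
    return (x - 12) ** 2 + (y - 5) ** 2 + 10.0


def make_noisy_measure(seed):
    """Return a measurement function with heavy-tailed multiplicative noise."""
    rng = random.Random(seed)

    def measure(point):
        # paretovariate returns values >= 1, so noise only inflates the time.
        return true_cost(point) * rng.paretovariate(1.5)

    return measure


if __name__ == "__main__":
    for k in (1, 5):
        point, cost = compass_search(make_noisy_measure(0), [0, 0], step=8, k=k)
        print(f"k={k}: point={point}, estimated cost={cost:.2f}")
```

With k=1 a single slow outlier can make a good point look bad and mislead the search; larger k costs more measurements per point but gives a steadier estimate, which is the trade-off between accuracy and efficiency that the abstract describes.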