In this paper, we present a runtime compilation and tuning framework for parallel programs. We extend our prior work on our auto-tuner, Active Harmony, to handle tunable parameters that require code generation (for example, different unroll factors). For such parameters, our auto-tuner generates and compiles new code on the fly. Effectively, we merge traditional feedback-directed optimization and just-in-time compilation. We show that our system can exploit the parallelism available in today's HPC platforms by evaluating different code variants on different nodes simultaneously. We evaluate our system on two parallel applications and show that it can improve execution time by up to 46% compared to the original version of the program.
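The tuning loop described above (generate a code variant for each value of a parameter such as the unroll factor, compile it at run time, measure the variants concurrently, keep the fastest) can be illustrated with a small sketch. This is not Active Harmony's API or the paper's implementation: the function names `generate_variant`, `compile_variant`, `evaluate`, and `tune` are hypothetical, the tuned kernel is an assumed dot product, and a thread pool stands in for the separate nodes on which variants would be evaluated.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def generate_variant(unroll):
    """Emit source for a dot product whose loop body is unrolled `unroll` times."""
    body = "\n".join(
        f"        total += a[i + {k}] * b[i + {k}]" for k in range(unroll)
    )
    return (
        "def dot(a, b):\n"
        "    n = len(a)\n"
        "    total = 0.0\n"
        f"    main = n - n % {unroll}\n"
        f"    for i in range(0, main, {unroll}):\n"
        f"{body}\n"
        "    for i in range(main, n):  # remainder loop\n"
        "        total += a[i] * b[i]\n"
        "    return total\n"
    )


def compile_variant(unroll):
    """Compile the generated source at run time and return the new function."""
    namespace = {}
    source = generate_variant(unroll)
    exec(compile(source, f"<dot_unroll_{unroll}>", "exec"), namespace)
    return namespace["dot"]


def evaluate(unroll, a, b, repeats=3):
    """Time one variant; returns (best elapsed seconds, unroll factor, result)."""
    fn = compile_variant(unroll)
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(a, b)
        best = min(best, time.perf_counter() - start)
    return best, unroll, result


def tune(candidates, a, b):
    """Evaluate all candidates concurrently; return (best_unroll, measurements)."""
    # Assumption: one worker per candidate stands in for one node per variant.
    with ThreadPoolExecutor(max_workers=len(candidates)) as pool:
        measurements = list(pool.map(lambda u: evaluate(u, a, b), candidates))
    _, best_unroll, _ = min(measurements)
    return best_unroll, measurements


if __name__ == "__main__":
    a = [float(i) for i in range(100_000)]
    b = [1.0] * 100_000
    best_unroll, measurements = tune([1, 2, 4, 8], a, b)
    for elapsed, unroll, _ in sorted(measurements):
        print(f"unroll={unroll}: {elapsed:.6f} s")
    print("selected unroll factor:", best_unroll)
```

Because Python threads share one interpreter, the concurrent timings here interfere with each other; the sketch shows only the control flow, and meaningful measurements would need isolated workers, as the separate nodes in the abstract provide.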