Supercompilers for parallel and vector computers
Introduction to parallel algorithms and architectures: arrays, trees, hypercubes
Theory, techniques, and experiments in solving recurrences in computer programs
Adaptive reduction parallelization techniques. In: Proceedings of the 14th International Conference on Supercomputing.
Machine Learning
Parallel Programming with Polaris. In: Computer.
Experience in the Automatic Parallelization of Four Perfect-Benchmark Programs. In: Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing.
Techniques for Reducing the Overhead of Run-Time Parallelization. In: CC '00: Proceedings of the 9th International Conference on Compiler Construction.
On the Automatic Parallelization of Sparse and Irregular Fortran Programs. In: LCR '98: Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers.
A Comparison of Locality Transformations for Irregular Codes. In: LCR '00: Selected Papers from the 5th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers.
Improving Compiler and Run-Time Support for Adaptive Irregular Codes. In: PACT '98: Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques.
Compile-time composition of run-time data and iteration reorderings. In: PLDI '03: Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation.
Metrics and models for reordering transformations. In: MSP '04: Proceedings of the 2004 Workshop on Memory System Performance.
An Adaptive Algorithm Selection Framework for Reduction Parallelization. In: IEEE Transactions on Parallel and Distributed Systems.
Irregular and dynamic memory reference patterns can cause significant performance variations for low-level algorithms in general, and for parallel algorithms in particular. We have previously shown that parallel reduction algorithms are quite input sensitive and can therefore benefit from an adaptive, reference-pattern-directed selection. In this paper we extend our previous work by detailing a systematic approach to dynamically selecting the best parallel algorithm. First, we model the characteristics of the input, i.e., the memory reference pattern, with a descriptor vector. Then, we measure the performance of several reduction algorithms for various values of the pattern descriptor. Finally, we establish a (many-to-one) mapping from a finite set of descriptor values to the set of algorithms, yielding a performance ranking of the available algorithms with respect to those descriptor values. The actual dynamic selection code is generated using statistical regression methods or a decision tree. We conclude with experimental results that validate our modeling and prediction techniques.
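The descriptor-to-algorithm mapping described in the abstract can be illustrated with a small sketch. The descriptor fields, threshold values, and algorithm names below are hypothetical stand-ins for illustration only; the paper derives its actual decision tree from measured performance data, not hand-picked cutoffs.

```python
# Minimal sketch of reference-pattern-directed algorithm selection.
# Descriptor fields, thresholds, and algorithm names are illustrative
# assumptions, not taken from the paper itself.
from dataclasses import dataclass

@dataclass
class PatternDescriptor:
    sparsity: float    # fraction of reduction elements actually touched
    contention: float  # fraction of references hitting shared elements

def select_reduction_algorithm(d: PatternDescriptor) -> str:
    """Toy decision tree mapping a descriptor to a parallel reduction scheme."""
    if d.sparsity < 0.1:
        # Few distinct elements touched: replicating the whole reduction
        # array per thread wastes memory, so privatize selectively.
        return "selective_privatization"
    if d.contention < 0.05:
        # Conflicts are rare, so fine-grained synchronization is cheap.
        return "lock_based"
    # Dense, high-contention pattern: give each thread a private copy
    # of the array and merge the partial results afterwards.
    return "replicated_buffer"

print(select_reduction_algorithm(PatternDescriptor(sparsity=0.05, contention=0.5)))
```

In the paper's framework, the branching structure and thresholds of such a selector are produced automatically, by fitting regression models or learning a decision tree over the measured (descriptor, runtime) samples, rather than being written by hand as above.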