Array SSA form and its use in parallelization
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
IEEE Transactions on Parallel and Distributed Systems
Adaptive reduction parallelization techniques
Proceedings of the 14th international conference on Supercomputing
Efficient compiler and run-time support for parallel irregular reductions
Parallel Computing - special issue on parallel computing for irregular applications
Time Stamp Algorithms for Runtime Parallelization of DOACROSS Loops with Dynamic Dependences
IEEE Transactions on Parallel and Distributed Systems
Graphics Gems
Optimizing Supercompilers for Supercomputers
Optimizing Supercompilers for Supercomputers
On the Automatic Parallelization of Sparse and Irregular Fortran Programs
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
A GSA-based compiler infrastructure to extract parallelism from complex loops
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Exploiting Locality in the Run-Time Parallelization of Irregular Loops
ICPP '02 Proceedings of the 2002 International Conference on Parallel Processing
Irregular Assignment Computations on cc-NUMA Multiprocessors
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
Balanced, locality-based parallel irregular reductions
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
System-scenario-based design of dynamic embedded systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Streaming-oriented parallelization of domain-independent irregular kernels
Euro-Par 2010 Proceedings of the 2010 conference on Parallel processing
A scalable, efficient scheme for evaluation of stencil computations over unstructured meshes
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Fix the code. Don't tweak the hardware: A new compiler approach to Voltage-Frequency scaling
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
Hi-index | 0.00 |
A loop with irregular assignment computations contains loop-carried output data dependences that can only be detected at run-time. In this paper, a load-balanced method based on the inspector-executor model is proposed to parallelize this loop pattern. The basic idea lies in splitting the iteration space of the sequential loop into sets of conflict-free iterations that can be executed concurrently on different processors. As will be demonstrated, this method outperforms existing techniques. Irregular access patterns with different load-balancing and reusability properties are considered in the experiments.