Scanning polyhedra with DO loops
PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
Efficient and exact data dependence analysis
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
The Omega test: a fast and practical integer programming algorithm for dependence analysis
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
ScaLAPACK user's guide
Generation of Efficient Nested Loops from Polyhedra
International Journal of Parallel Programming - Special issue on instruction-level parallelism and parallelizing compilation, part 2
Parallel Programming with Polaris
Computer
A unified framework for nonlinear dependence testing and symbolic analysis
Proceedings of the 18th annual international conference on Supercomputing
Code Generation in the Polyhedral Model Is Easier Than You Think
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Data dependence analysis techniques for increased accuracy and extracted parallelism
International Journal of Parallel Programming - Special issue II: The 17th annual international conference on supercomputing (ICS'03)
Proceedings of the 20th annual international conference on Supercomputing
A practical automatic polyhedral parallelizer and locality optimizer
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Distributed SBP Cholesky factorization algorithms with near-optimal scheduling
ACM Transactions on Mathematical Software (TOMS)
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Nonlinear Symbolic Analysis for Advanced Program Parallelization
IEEE Transactions on Parallel and Distributed Systems
Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
IPDPSW '11 Proceedings of the 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum
DAGuE: A Generic Distributed DAG Engine for High Performance Computing
IPDPSW '11 Proceedings of the 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum
Performance Portability of a GPU Enabled Factorization with the DAGuE Framework
CLUSTER '11 Proceedings of the 2011 IEEE International Conference on Cluster Computing
DAGuE: A generic distributed DAG engine for High Performance Computing
Parallel Computing
Hi-index | 0.00 |
Programmability and performance portability are two major challenges in today's dynamic environment. Algorithm designers targeting efficient algorithms should focus on designing high-level algorithms exhibiting maximum parallelism, while relying on compilers and run-time systems to discover and exploit this parallelism, delivering sustainable performance on a variety of hardware. The compiler tool presented in this paper can analyze the data flow of serial codes with imperfectly nested, affine loop-nests and if statements, commonly found in scientific applications. This tool operates as the front-end compiler for the DAGuE run-time system by automatically converting serial codes into the symbolic representation of their data flow. We show how the compiler analyzes the data flow, and demonstrate that scientifically important, dense linear algebra operations can benefit from this analysis, and deliver high performance on large scale platforms.