Supercomputing applications usually involve the repeated parallel application of discretized differential operators. Difficulties arise with higher-order discretizations of operators on parallel computers because their communications can overlap processors in complex ways. Their correct and efficient implementation requires careful choreography of computation and communication, taking into account the symmetries of the problem and of the computer's communication network. This paper shows how these symmetries can be used to automate the construction of the code for optimized operator computation. This is done with considerable generality by making the symmetries of both the problem and the computer explicit in the language of finitely presented reflection (Coxeter) groups, and by using coset enumeration to generate and optimize the required code.
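As a rough illustration of the idea, and not the paper's own method, the sketch below enumerates the images of a single stencil offset under the symmetry group of a square grid. That group is the Coxeter group B2 of order 8, generated by two reflections. By the orbit-stabilizer correspondence, each image stands for one coset of the subgroup that fixes the starting offset. Note the substitution: true coset enumeration (Todd-Coxeter) works directly from a group presentation, whereas this sketch uses a breadth-first search over a concrete representation. The names `swap_axes`, `flip_x`, `GENERATORS`, and `enumerate_orbit` are hypothetical and chosen only for this example.

```python
from collections import deque

# Two generating reflections of the symmetry group of a square grid
# (the Coxeter group B2, order 8). Names are illustrative assumptions.
def swap_axes(p):
    """Reflect across the diagonal x = y."""
    x, y = p
    return (y, x)

def flip_x(p):
    """Reflect across the y-axis."""
    x, y = p
    return (-x, y)

GENERATORS = {"a": swap_axes, "b": flip_x}

def enumerate_orbit(start):
    """Breadth-first enumeration of the images of `start`.

    Returns a dict mapping each image to a shortest generator word
    (applied left to right) that carries `start` to it. Each image
    corresponds to one coset of the stabilizer of `start`.
    """
    words = {start: ""}
    queue = deque([start])
    while queue:
        p = queue.popleft()
        for name, reflect in GENERATORS.items():
            q = reflect(p)
            if q not in words:
                words[q] = words[p] + name
                queue.append(q)
    return words

if __name__ == "__main__":
    for offset in [(0, 0), (1, 0), (1, 1), (2, 1)]:
        orbit = enumerate_orbit(offset)
        print(offset, len(orbit), sorted(orbit))
```

In a stencil setting, every offset in one orbit could share a coefficient and a communication pattern up to symmetry, so code written for one representative can in principle be replicated across the whole orbit by applying the recorded generator words.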