Irregular applications frequently exhibit poor performance on contemporary computer architectures, in large part because of their inefficient use of the memory hierarchy. Run-time data- and iteration-reordering transformations have been shown to improve the locality, and therefore the performance, of irregular benchmarks. This paper describes models for determining which combination of run-time data- and iteration-reordering heuristics will result in the best performance for a given dataset. We propose that the data- and iteration-reordering transformations be viewed as approximating minimal linear arrangements on two separate hypergraphs: a spatial locality hypergraph and a temporal locality hypergraph. Our results measure the efficacy of locality metrics based on these hypergraphs in guiding the selection of data- and iteration-reordering heuristics. We also introduce new iteration- and data-reordering heuristics based on the hypergraph models that result in better performance than do previous heuristics.
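To make the minimal-linear-arrangement view concrete, here is a minimal sketch of how a locality metric over such hypergraphs might be computed. It is an illustration under stated assumptions, not the metric defined in the paper: it assumes the spatial locality hypergraph has one vertex per data item and one hyperedge per iteration (the data items that iteration touches), that the temporal locality hypergraph has one vertex per iteration and one hyperedge per data item (the iterations touching it), and that an ordering is scored by summing the span of each hyperedge. The names `span_metric` and `temporal_hyperedges` are hypothetical.

```python
from collections import defaultdict


def span_metric(hyperedges, order):
    """Score a linear arrangement: sum over hyperedges of (max pos - min pos).

    `order` lists the vertices in their arranged sequence. Lower is better.
    This span-based score is an assumed stand-in for a locality metric.
    """
    pos = {v: i for i, v in enumerate(order)}
    total = 0
    for edge in hyperedges:
        positions = [pos[v] for v in edge]
        if positions:
            total += max(positions) - min(positions)
    return total


def temporal_hyperedges(iterations):
    """Build one hyperedge per data item: the set of iterations that touch it."""
    touched = defaultdict(set)
    for it, items in enumerate(iterations):
        for d in items:
            touched[d].add(it)
    return list(touched.values())


# Toy irregular loop: each iteration touches two data items.
# Used directly, these tuples are the hyperedges of the (assumed) spatial hypergraph.
iterations = [(0, 3), (3, 1), (1, 2)]

# Compare two data orderings on the spatial hypergraph.
spatial_original = span_metric(iterations, [0, 1, 2, 3])
spatial_reordered = span_metric(iterations, [0, 3, 1, 2])

# Score the original iteration order on the temporal hypergraph.
temporal_original = span_metric(temporal_hyperedges(iterations), [0, 1, 2])

print(spatial_original, spatial_reordered, temporal_original)
```

In this toy case the data ordering `[0, 3, 1, 2]` places items that are accessed together next to each other, so its spatial score is lower than that of the original ordering. A selection procedure along the lines the abstract describes could compare such scores across candidate reordering heuristics before committing to one.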