This paper describes two new mechanisms by which a High Performance Fortran compiler can handle irregular computations effectively. The first invokes a user-specified mapping procedure through a set of proposed compiler directives. The directives allow program arrays to describe graph connectivity, the spatial location of array elements, and computational load. The second is a conservative method for compiling irregular loops in which dependences arise only from reduction operations. In many cases, this method lets the compiler recognize that information previously computed by inspectors can be reused (e.g., communication schedules, loop iteration partitions, and information that associates off-processor data copies with on-processor buffer locations). The paper also presents performance results for these mechanisms from a Fortran 90D compiler implementation.
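The reuse of inspector results can be illustrated with a small single-process sketch in Python. It is not the paper's compiler-generated code. It assumes a reduction loop over edges of the form `y[ia[k]] += flux; y[ib[k]] -= flux`, a fixed `owner` array giving the processor that holds each element, and a write counter on each indirection array as a conservative test for whether a saved schedule is still valid. All names here (`IndirectionArray`, `ScheduleCache`, `inspector`, `executor`, `run_sweep`) are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class IndirectionArray:
    """Index array that counts writes (hypothetical helper, not from the paper)."""

    def __init__(self, values: List[int]) -> None:
        self._values = list(values)
        self.stamp = 0  # bumped on every write

    def __len__(self) -> int:
        return len(self._values)

    def __getitem__(self, k: int) -> int:
        return self._values[k]

    def __setitem__(self, k: int, value: int) -> None:
        self._values[k] = value
        self.stamp += 1


class Schedule:
    """Inspector output: maps each off-processor global index to a buffer slot."""

    def __init__(self, ghost_slot: Dict[int, int], stamps: Tuple[int, int]) -> None:
        self.ghost_slot = ghost_slot
        self.stamps = stamps  # write counters seen when the schedule was built


def inspector(ia: IndirectionArray, ib: IndirectionArray,
              owner: List[int], me: int) -> Schedule:
    """Scan the index arrays once and assign buffer slots to off-processor data."""
    ghost_slot: Dict[int, int] = {}
    for k in range(len(ia)):
        for g in (ia[k], ib[k]):
            if owner[g] != me and g not in ghost_slot:
                ghost_slot[g] = len(ghost_slot)
    return Schedule(ghost_slot, (ia.stamp, ib.stamp))


class ScheduleCache:
    """Reruns the inspector only if an index array was written since last time.

    Conservative: any write invalidates the schedule, even one that leaves
    the access pattern unchanged. Assumes owner and me never change.
    """

    def __init__(self) -> None:
        self.schedule: Optional[Schedule] = None
        self.inspector_runs = 0

    def get(self, ia: IndirectionArray, ib: IndirectionArray,
            owner: List[int], me: int) -> Schedule:
        stamps = (ia.stamp, ib.stamp)
        if self.schedule is None or self.schedule.stamps != stamps:
            self.schedule = inspector(ia, ib, owner, me)
            self.inspector_runs += 1
        return self.schedule


def executor(sched: Schedule, ia: IndirectionArray, ib: IndirectionArray,
             x: List[float], y: List[float], owner: List[int], me: int) -> None:
    """Gather ghost values, run the reduction loop, then scatter-add results."""
    ghost_x = [0.0] * len(sched.ghost_slot)
    ghost_y = [0.0] * len(sched.ghost_slot)
    for g, slot in sched.ghost_slot.items():  # stands in for a gather message
        ghost_x[slot] = x[g]

    def read(g: int) -> float:
        return x[g] if owner[g] == me else ghost_x[sched.ghost_slot[g]]

    def accumulate(g: int, value: float) -> None:
        if owner[g] == me:
            y[g] += value
        else:
            ghost_y[sched.ghost_slot[g]] += value

    for k in range(len(ia)):
        i, j = ia[k], ib[k]
        flux = read(i) - read(j)
        accumulate(i, flux)
        accumulate(j, -flux)

    for g, slot in sched.ghost_slot.items():  # stands in for a scatter-add message
        y[g] += ghost_y[slot]


def run_sweep(cache: ScheduleCache, ia: IndirectionArray, ib: IndirectionArray,
              x: List[float], owner: List[int], me: int) -> List[float]:
    """One time step: fetch (or rebuild) the schedule, then execute the loop."""
    sched = cache.get(ia, ib, owner, me)
    y = [0.0] * len(x)
    executor(sched, ia, ib, x, y, owner, me)
    return y
```

Calling `run_sweep` repeatedly with unchanged index arrays runs the inspector once. A write such as `ia[2] = 1` bumps the counter and forces a rebuild on the next sweep. A real distributed implementation would replace the gather and scatter-add loops with communication driven by the same schedule.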