Runtime Support and Compilation Methods for User-Specified Irregular Data Distributions

Authors:
Ravi Ponnusamy; Joel Saltz; Alok Choudhary; Yuan-Shin Hwang; Geoffrey Fox
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1995

Abstract

This paper describes two new mechanisms by which a High Performance Fortran compiler can deal with irregular computations effectively. The first mechanism invokes a user-specified mapping procedure via a set of proposed compiler directives. The directives allow the use of program arrays to describe graph connectivity, the spatial location of array elements, and computational load. The second mechanism is a conservative method for compiling irregular loops in which dependences arise only from reduction operations. In many cases, this mechanism enables a compiler to recognize that previously computed inspector information (e.g., communication schedules, loop iteration partitions, and information that associates off-processor data copies with on-processor buffer locations) can be reused. The paper also presents performance results for these mechanisms from a Fortran 90D compiler implementation.
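
The inspector reuse described in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the paper's runtime library or compiler output: the names Indirection, Runtime, inspector, and sweep are hypothetical, the version counter stands in for whatever conservative check a compiler might generate, and the rule that an edge runs on the owner of its first endpoint is an arbitrary choice of iteration partition. The loop being parallelized is the reduction y[a] += x[b]; y[b] += x[a] over an edge list.

```python
class Indirection:
    """Edge list with a version stamp bumped on every modification (hypothetical)."""

    def __init__(self, edges):
        self.edges = list(edges)
        self.version = 0

    def replace(self, edges):
        self.edges = list(edges)
        self.version += 1


def inspector(owner, me, edges):
    """Map every global index this processor may touch to a local buffer slot."""
    slot = {}
    for g, p in enumerate(owner):
        if p == me:
            slot[g] = len(slot)          # owned elements come first
    ghosts = []                          # off-processor elements, first-touch order
    for a, b in edges:
        for g in (a, b):
            if g not in slot:
                slot[g] = len(slot)      # buffer location for the off-processor copy
                ghosts.append(g)
    return slot, ghosts


class Runtime:
    """Simulates nprocs processors in one process; owner[g] is the home of element g."""

    def __init__(self, owner, nprocs):
        self.owner = owner
        self.nprocs = nprocs
        self.cache = {}                  # processor -> (version, slot, ghosts, my_edges)
        self.inspector_runs = 0

    def schedule(self, me, ind):
        hit = self.cache.get(me)
        if hit is not None and hit[0] == ind.version:
            return hit[1:]               # indirection unchanged: reuse saved results
        # Assumed iteration partition: an edge runs on the owner of its first endpoint.
        my_edges = [(a, b) for a, b in ind.edges if self.owner[a] == me]
        slot, ghosts = inspector(self.owner, me, my_edges)
        self.inspector_runs += 1
        self.cache[me] = (ind.version, slot, ghosts, my_edges)
        return slot, ghosts, my_edges

    def sweep(self, x, ind):
        """Executor for: y[a] += x[b]; y[b] += x[a] for every edge (a, b)."""
        y = [0.0] * len(x)
        for me in range(self.nprocs):
            slot, ghosts, my_edges = self.schedule(me, ind)
            xbuf = [0.0] * len(slot)
            ybuf = [0.0] * len(slot)
            for g, s in slot.items():    # owned values plus gathered ghost copies
                xbuf[s] = x[g]
            for a, b in my_edges:        # purely local accumulation
                ybuf[slot[a]] += xbuf[slot[b]]
                ybuf[slot[b]] += xbuf[slot[a]]
            for g, s in slot.items():    # scatter-add partial sums back to the owners
                y[g] += ybuf[s]
        return y


owner = [0, 0, 1, 1]                     # elements 0,1 on processor 0; 2,3 on processor 1
ind = Indirection([(0, 1), (1, 2), (2, 3), (3, 0)])
rt = Runtime(owner, nprocs=2)
y_first = rt.sweep([1.0, 2.0, 3.0, 4.0], ind)    # runs the inspector on each processor
y_second = rt.sweep([1.0, 2.0, 3.0, 4.0], ind)   # reuses the cached schedules
```

In this sketch the check is conservative in the same spirit as the abstract: any modification of the indirection data invalidates the cached schedule, even if the new edges happen to touch the same elements. Because the only cross-iteration dependence is the commutative accumulation into y, each processor can sum into its own buffer and combine the results afterwards.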