Dependence Uniformization: A Loop Parallelization Technique

Authors:
T. H. Tzen; L. M. Ni
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1993

Abstract

Data dependence uniformization, a method for overcoming the difficulties in parallelizing a doubly nested loop with irregular dependence constraints, is proposed. The approach is based on the concept of vector decomposition. A simple set of basic dependences is developed from which all dependence constraints can be composed. This set of basic dependences is added to every iteration to replace all original dependences, so that the dependence constraints become uniform. An efficient synchronization method is presented to obey the uniform dependence constraints in every iteration.
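
The vector decomposition idea can be illustrated with a minimal sketch. It is not the authors' algorithm. It only checks whether a dependence distance vector of a doubly nested loop can be written as a non-negative integer combination of two candidate basic dependence vectors. The function name `decompose`, the type alias `Vec`, and the sample vectors are assumptions made for illustration. The abstract does not say how the basic set is chosen.

```python
from typing import Optional, Tuple

Vec = Tuple[int, int]  # (outer-loop distance, inner-loop distance)


def decompose(d: Vec, b1: Vec, b2: Vec) -> Optional[Tuple[int, int]]:
    """Return non-negative integers (a, b) with d == a*b1 + b*b2, else None.

    Illustrative helper only: it solves the 2x2 linear system by
    Cramer's rule and rejects fractional or negative coefficients.
    """
    det = b1[0] * b2[1] - b1[1] * b2[0]
    if det == 0:
        return None  # candidate basic vectors are collinear

    a_num = d[0] * b2[1] - d[1] * b2[0]
    b_num = b1[0] * d[1] - b1[1] * d[0]
    if a_num % det != 0 or b_num % det != 0:
        return None  # coefficients would not be integers

    a, b = a_num // det, b_num // det
    return (a, b) if a >= 0 and b >= 0 else None


if __name__ == "__main__":
    basic = ((0, 1), (1, -1))  # hypothetical basic dependence set
    for d in [(1, 2), (2, -1), (3, -3), (1, -2)]:
        print(d, "->", decompose(d, *basic))
```

Under these assumptions, a vector that decomposes (for example `(1, 2)` as three times `(0, 1)` plus once `(1, -1)`) is enforced transitively when every iteration synchronizes only on the two basic vectors. A vector that does not decompose, such as `(1, -2)`, shows that the candidate set is too small for that loop.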