Uniform dependence algorithms with arbitrary index sets are considered, and two computationally inexpensive methods for finding their independent partitions are proposed. Each method has advantages over the other for certain kinds of applications, and both outperform previously proposed approaches in computational complexity, optimality, or both. Lower and upper bounds are also given for the cardinality of maximal independent partitions. In multiple instruction, multiple data (MIMD) systems, assigning different blocks of an independent partition to different processors eliminates interprocessor communication entirely. This is significant because communication usually dominates the overhead in MIMD machines.
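
To make the notion of an independent partition concrete, the following is a minimal brute-force sketch. It is not either of the two methods proposed in the paper. It assumes the index set is given as an explicit finite list of integer tuples and that every dependence is a constant vector. The function name `independent_partition` and the example dependence vectors are hypothetical, chosen only for illustration. The sketch merges any two index points linked by a dependence, using a union-find structure, so no dependence crosses from one resulting block to another.

```python
from itertools import product


def independent_partition(index_set, dependence_vectors):
    """Split an index set into blocks with no dependence between blocks.

    Brute-force illustration only (assumption: explicit finite index set,
    constant dependence vectors). Not the paper's proposed methods.
    """
    points = set(index_set)
    parent = {p: p for p in points}

    def find(p):
        # Follow parent links to the representative, halving paths on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    def union(p, q):
        rp, rq = find(p), find(q)
        if rp != rq:
            parent[rp] = rq

    # A dependence d links point p to point p + d when both lie in the index set.
    for p in points:
        for d in dependence_vectors:
            q = tuple(a + b for a, b in zip(p, d))
            if q in points:
                union(p, q)

    # Collect the points of each connected component into one block.
    blocks = {}
    for p in points:
        blocks.setdefault(find(p), []).append(p)
    return [sorted(block) for block in blocks.values()]


# Hypothetical example: a 4 x 4 index set with dependences (2, 0) and (0, 2).
index_set = list(product(range(4), range(4)))
blocks = independent_partition(index_set, [(2, 0), (0, 2)])
```

In this example the points fall into four blocks of four points each, one per combination of coordinate parities. Each block could be assigned to a separate processor without any interprocessor communication. The sketch visits every index point, so it only illustrates the definition and does not reflect the low computational cost claimed for the proposed methods.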