Time Optimal Linear Schedules for Algorithms with Uniform Dependencies

Authors:
Weijia Shang; Jose A. B. Fortes
Venue:
IEEE Transactions on Computers
Year:
1991

Abstract

The authors address the problem of identifying optimal linear schedules for uniform dependence algorithms, that is, linear schedules that minimize execution time. The proposed procedures are based on the mathematical solution of a nonlinear optimization problem. Their complexity is independent of the size of the algorithm. It is instead exponential in the dimension of the algorithm's index set, which in practice makes it very small, because algorithms of practical interest have index sets of limited dimension. The authors also consider a particular class of algorithms for which the proposed solution is greatly simplified, and provide the corresponding simpler organization procedure.
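
To make the problem statement concrete, the following is a minimal brute-force sketch, not the procedure proposed in the paper. It assumes one common formulation from the uniform-dependence literature: a linear schedule is an integer vector `pi`, it is valid when `pi . d >= 1` for every dependence vector `d`, and for a box-shaped index set `0 <= j_k <= extents[k]` its execution time is `1 + sum(|pi_k| * extents[k])`. The function names, the search `bound`, and the example dependence vectors are all hypothetical.

```python
from itertools import product


def is_valid(pi, deps):
    """A schedule is valid if every dependence vector advances time by >= 1."""
    return all(sum(p * x for p, x in zip(pi, d)) >= 1 for d in deps)


def makespan(pi, extents):
    """Execution time on the box 0 <= j_k <= extents[k] (assumed index set).

    pi . j spans an interval of width sum(|pi_k| * extents[k]);
    adding 1 counts the time steps from first to last.
    """
    return 1 + sum(abs(p) * m for p, m in zip(pi, extents))


def best_linear_schedule(deps, extents, bound=3):
    """Enumerate integer vectors with entries in [-bound, bound] (assumption)."""
    best = None
    for pi in product(range(-bound, bound + 1), repeat=len(extents)):
        if is_valid(pi, deps):
            t = makespan(pi, extents)
            if best is None or t < best[1]:
                best = (pi, t)
    return best  # None if no valid schedule exists within the bound


# Hypothetical 2-D example: three uniform dependence vectors on a 5 x 5 grid.
deps = [(1, 0), (0, 1), (1, 1)]
extents = (4, 4)
print(best_linear_schedule(deps, extents))
```

In this sketch the number of candidates is `(2 * bound + 1) ** n` for an `n`-dimensional index set, so the search cost grows with the dimension but not with `extents`, which loosely mirrors the complexity remark in the abstract.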