This paper addresses the problem of transforming the irregular data dependence structures of nested-loop algorithms into more regular ones. The algorithms considered are n-dimensional algorithms (algorithms with n nested loops) with affine dependences, that is, dependences that are affine functions of the loop index variables. Methods are proposed to uniformize such algorithms, i.e., to transform them into uniform dependence algorithms, whose dependences are constant (independent of the index variables). Because uniformization may sacrifice some parallelism, two objectives guide the selection among feasible uniformizations. The first is to reduce the number of dependences after uniformization. The second is to maximize the parallelism preserved by the uniformization, which is measured by 1) the total execution time under the optimal linear schedule, which assigns each computation an execution time given by a linear function of its index, and 2) the size of the cone spanned by the dependence vectors after uniformization.
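As a rough illustration of the first parallelism measure, the following Python sketch compares the optimal linear schedule before and after uniformization on a toy example. Everything in it is an assumption made for illustration and is not taken from the paper: the 2-D loop in which point (i, j) reads the value computed at (i, 0), the box-shaped index set, the candidate uniform vector sets, and the brute-force search over small integer schedule vectors (the function names are hypothetical). A schedule vector pi is treated as valid when pi · d >= 1 for every dependence vector d, and on a box index set 0 <= j_k <= upper[k] its total execution time is 1 + sum(|pi_k| * upper[k]).

```python
from itertools import product


def affine_dependence_vectors(n):
    """Dependence vectors of a hypothetical n x n loop nest in which
    point (i, j), j >= 1, reads the value computed at (i, 0).
    The vector (0, j) varies with the index, so the dependence is
    affine rather than uniform."""
    return {(0, j) for i in range(n) for j in range(1, n)}


def is_valid(pi, deps):
    """A linear schedule pi respects the dependences if pi . d >= 1 for all d."""
    return all(sum(p * d for p, d in zip(pi, dep)) >= 1 for dep in deps)


def total_time(pi, upper):
    """Total execution time of schedule pi on the box 0 <= j_k <= upper[k]:
    the largest difference pi . (j1 - j2) over the box, plus one."""
    return 1 + sum(abs(p) * u for p, u in zip(pi, upper))


def best_linear_schedule(deps, upper, bound=3):
    """Brute-force search (illustrative only) over integer schedule vectors
    with entries in [-bound, bound]; returns (pi, time) or None."""
    best = None
    for pi in product(range(-bound, bound + 1), repeat=len(upper)):
        if is_valid(pi, deps):
            t = total_time(pi, upper)
            if best is None or t < best[1]:
                best = (pi, t)
    return best


n = 4
upper = (n - 1, n - 1)

affine = affine_dependence_vectors(n)   # {(0, 1), (0, 2), (0, 3)}
uniform_a = {(0, 1)}                    # assumed uniformization: pass the value along j
uniform_b = {(0, 1), (1, 0)}            # assumed alternative with an extra vector

for name, deps in [("affine", affine), ("uniform_a", uniform_a), ("uniform_b", uniform_b)]:
    print(name, len(deps), best_linear_schedule(deps, upper))
```

In this toy case, `uniform_a` has fewer dependence vectors than the original and keeps the same optimal schedule time, whereas `uniform_b` forces a longer schedule. This is the kind of trade-off the two objectives are meant to distinguish. The second measure, the size of the cone spanned by the dependence vectors, is not computed here.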