Optimal semi-oblique tiling

Authors:
R. Andonov; S. Balev; S. Rajopadhye; N. Yanev
Affiliations:
LAMIH/ROI, Valenciennes, France; LAMIH/ROI, Valenciennes, France; IRISA, Rennes, France; University of Sofia, Sofia, Bulgaria
Venue:
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Year:
2001

Abstract

For 2-D iteration space tiling, we address the problem of determining the tile parameters that minimize the total execution time under the BSP model. We consider uniform dependency computations, tiled so that (at least) one of the tile boundaries is parallel to the domain boundary. We determine the optimal tile size as a closed-form solution. In addition, we determine the optimal number of processors and the optimal slope of the oblique tile boundary. Our predictions are validated, among other examples, on a sequence alignment problem specialized to similar sequences using Fickett's “k-band” algorithm, for which our optimal semi-oblique tiling yields an improvement over orthogonal tiling by a factor of 2.5. Our optimal solution requires a block-cyclic distribution of tiles to processors. The best one can obtain with only block distribution (as many authors require) is 3 times slower.
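
The difference between the two tile-to-processor distributions named in the abstract can be illustrated with a minimal sketch. It assumes tiles are indexed along one dimension (for example, tile columns) and that processors are numbered from 0. The function names and the `block` parameter are hypothetical, chosen for illustration, and are not taken from the paper. The sketch shows only the mapping itself, not the paper's cost model or its optimal parameters.

```python
def block_owner(tile: int, num_tiles: int, num_procs: int) -> int:
    """Block distribution: each processor owns one contiguous run of tiles."""
    run = -(-num_tiles // num_procs)  # ceiling division: tiles per processor
    return tile // run


def block_cyclic_owner(tile: int, block: int, num_procs: int) -> int:
    """Block-cyclic distribution: blocks of `block` consecutive tiles are
    dealt to processors round-robin (block is a hypothetical parameter)."""
    return (tile // block) % num_procs


num_tiles, num_procs = 8, 2

# Owner of each tile under the two schemes.
block_map = [block_owner(t, num_tiles, num_procs) for t in range(num_tiles)]
cyclic_map = [block_cyclic_owner(t, 2, num_procs) for t in range(num_tiles)]

print(block_map)   # [0, 0, 0, 0, 1, 1, 1, 1]
print(cyclic_map)  # [0, 0, 1, 1, 0, 0, 1, 1]
```

Under the block mapping, a processor receives a single contiguous group of tiles. Under the block-cyclic mapping, each processor revisits the domain several times, which is the kind of distribution the abstract says the optimal solution requires.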