This paper describes Cronus, a platform for parallelizing general nested loops. General nested loops contain complex loop bodies (assignments, conditionals, repetitions) and exhibit uniform loop-carried dependencies. The novelty of Cronus is twofold: (1) it determines the optimal scheduling hyperplane using the QuickHull algorithm, which is more efficient than previously used methods, and (2) it implements a simple and efficient dynamic rule (successive dynamic scheduling) for the runtime scheduling of the loop iterations along the optimal hyperplane. This scheduling policy enhances data locality and improves the makespan. Cronus provides an efficient runtime library, specifically designed to minimize communication, that performs better than more generic systems such as Berkeley UPC. Its performance was evaluated through extensive testing on three representative case studies: the Floyd-Steinberg dithering algorithm, the Transitive Closure algorithm, and the FSBM motion estimation algorithm. The experimental results corroborate the efficiency of the parallel code, showing speedups ranging from 1.18 (of an ideal 4) to 12.29 (of an ideal 16) on distributed-memory systems, and from 3.60 (of 4) to 15.79 (of 16) on shared-memory systems. Cronus outperforms UPC by 5-95%, depending on the test case.
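To illustrate the scheduling idea the abstract describes, the following is a minimal sketch (not Cronus's actual implementation) of hyperplane scheduling for a loop with uniform dependencies: a vector pi is a legal linear schedule when pi . d >= 1 for every dependence vector d, and all iterations lying on the same hyperplane pi . j = t are independent and can run in parallel. The dependence vectors below are those of Floyd-Steinberg error diffusion, one of the paper's case studies; all function names are illustrative.

```python
def valid_schedule(pi, deps):
    """A hyperplane pi is a legal linear schedule iff pi . d >= 1
    for every uniform dependence vector d."""
    return all(pi[0] * d[0] + pi[1] * d[1] >= 1 for d in deps)

def wavefronts(pi, n, m):
    """Group an n x m iteration space into hyperplanes pi . (i, j) = t.
    Iterations on the same hyperplane carry no dependence between them,
    so each wavefront can execute in parallel; wavefronts run in order of t."""
    fronts = {}
    for i in range(n):
        for j in range(m):
            fronts.setdefault(pi[0] * i + pi[1] * j, []).append((i, j))
    return [fronts[t] for t in sorted(fronts)]

# Floyd-Steinberg error-diffusion dependence vectors: pixel (i, j) depends
# on (i, j-1), (i-1, j), (i-1, j-1), and (i-1, j+1).
deps = [(0, 1), (1, 0), (1, 1), (1, -1)]

pi = (2, 1)  # a legal hyperplane for these dependencies
assert valid_schedule(pi, deps)

fronts = wavefronts(pi, 4, 4)
# The makespan is the number of hyperplanes: pi . (n-1, m-1) + 1 = 10 here.
```

In the paper's setting, QuickHull speeds up finding the makespan-minimal pi (only dependence vectors on the convex hull constrain the optimum), while successive dynamic scheduling assigns iterations of each wavefront to processors at runtime; the sketch above only shows the legality test and the wavefront partition.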