Optimally Maximizing Iteration-Level Loop Parallelism

Authors:
Duo Liu;Yi Wang;Zili Shao;Minyi Guo;Jingling Xue
Affiliations:
The Hong Kong Polytechnic University, Hong Kong and Southwest University of Science and Technology, Mianyang;The Hong Kong Polytechnic University, Hong Kong;The Hong Kong Polytechnic University, Hong Kong;Shanghai Jiao Tong University, Shanghai;University of New South Wales, Sydney
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2012

Abstract

Loops are the main source of parallelism in many applications. This paper solves the open problem of extracting the maximal number of iterations from a loop that can run in parallel on chip multiprocessors. Our algorithm solves the problem optimally, in two phases, by migrating the weights of parallelism-inhibiting dependences on dependence cycles. First, we model dependence migration with retiming and formulate this classic loop parallelization problem as a graph optimization problem, i.e., one of finding retiming values for the nodes of the graph so that the minimum nonzero edge weight in the graph is maximized. We present our algorithm in three stages, each built incrementally on the preceding one. Second, the optimal code for a loop is generated from the retimed graph of the loop found in the first phase. We demonstrate the effectiveness of our optimal algorithm by comparing it with a number of representative nonoptimal algorithms, using a set of benchmarks frequently used in prior work and a set of graphs generated by TGFF.
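
The graph optimization problem stated in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it is a brute-force search over a toy three-node dependence cycle, and the graph, the function names, and the search bound `max_r` are all assumptions made for illustration. It also assumes the common retiming convention in which an edge u -> v with weight w gets the retimed weight w + r(u) - r(v), and it treats a retiming as legal only when no edge weight becomes negative.

```python
from itertools import product


def retime(edges, r):
    """Return retimed edge weights, assuming w_r(u, v) = w(u, v) + r(u) - r(v)."""
    return {(u, v): w + r[u] - r[v] for (u, v), w in edges.items()}


def min_nonzero_weight(weights):
    """Smallest strictly positive edge weight, or infinity if every weight is zero."""
    nonzero = [w for w in weights.values() if w > 0]
    return min(nonzero) if nonzero else float("inf")


def best_retiming_bruteforce(nodes, edges, max_r):
    """Try every retiming with values in 0..max_r and keep the best legal one.

    Exhaustive search is exponential in the number of nodes; it is used here
    only to make the objective concrete on a toy graph.
    """
    best_score, best_r = None, None
    for values in product(range(max_r + 1), repeat=len(nodes)):
        r = dict(zip(nodes, values))
        retimed = retime(edges, r)
        if any(w < 0 for w in retimed.values()):
            continue  # illegal retiming: a dependence would point backwards
        score = min_nonzero_weight(retimed)
        if best_score is None or score > best_score:
            best_score, best_r = score, r
    return best_score, best_r


# Hypothetical dependence graph: one cycle A -> B -> C -> A with total weight 4.
nodes = ["A", "B", "C"]
edges = {("A", "B"): 1, ("B", "C"): 1, ("C", "A"): 2}

score, r = best_retiming_bruteforce(nodes, edges, max_r=4)
retimed = retime(edges, r)
print("original minimum nonzero weight:", min_nonzero_weight(edges))
print("best minimum nonzero weight:", score)
print("retiming:", r)
print("retimed weights:", retimed)
```

In this toy cycle the total weight is unchanged by retiming, so the search can at best move all of it onto a single edge. That raises the minimum nonzero edge weight from 1 to 4 while the other two edges drop to zero.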