SPM Conscious Loop Scheduling for Embedded Chip Multiprocessors

Authors:
Liping Xue;Mahmut Kandemir;Guangyu Chen;Taylan Yemliha
Affiliations:
Pennsylvania State University, USA;Pennsylvania State University, USA;Pennsylvania State University, USA;Syracuse University, USA
Venue:
ICPADS '06 Proceedings of the 12th International Conference on Parallel and Distributed Systems - Volume 1
Year:
2006

Abstract

One of the major factors that can slow the widespread adoption of embedded chip multiprocessors is the lack of efficient software support. In particular, automated code parallelizers are badly needed, since it is not realistic to expect an average programmer to parallelize a large, complex embedded application over multiple processors while simultaneously taking into account factors such as code density, data locality, performance, power, and code resilience. Moreover, the increasing use of software-managed scratch-pad memory (SPM) components in embedded systems requires SPM-conscious code parallelization. Motivated by this observation, this paper proposes a novel compiler-based SPM-conscious loop scheduling strategy for array/loop-based embedded applications. The strategy pursues two objectives. First, the sets of loop iterations assigned to different processors should take approximately the same amount of time to finish. Second, the set of iterations assigned to a processor should exhibit high data reuse. Satisfying these two objectives helps minimize the parallel execution time of the application at hand. To achieve them, our scheduling strategy distributes loop iterations across parallel processors in an SPM-conscious manner: the compiler analyzes the loop, identifies the potential SPM hits and misses, and distributes loop iterations over processors such that the processors have approximately the same execution time. Our experimental results so far indicate that the proposed approach outperforms existing loop schedulers. Specifically, on a chip multiprocessor with 8 cores, it improves parallel execution time by 18.9%, 22.4%, and 11.1% over a previously proposed static scheduler, a dynamic scheduler, and an alternate locality-conscious scheduler, respectively.
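
The load-balancing objective described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not specify in detail. It assumes that a compiler analysis has already estimated the number of SPM hits and misses for each iteration, that the latencies HIT_CYCLES and MISS_CYCLES are fixed (the values here are made up), and that each processor receives a contiguous block of iterations as a rough stand-in for keeping data reuse local. The names iteration_cost, balanced_chunks, and chunk_costs are hypothetical.

```python
from typing import List, Tuple

# Assumed latencies in cycles; illustrative values, not taken from the paper.
HIT_CYCLES = 2     # access served from the scratch-pad memory
MISS_CYCLES = 80   # access that must go off-chip


def iteration_cost(hits: int, misses: int) -> int:
    """Estimated cycles for one iteration, given predicted SPM hits and misses."""
    return hits * HIT_CYCLES + misses * MISS_CYCLES


def balanced_chunks(costs: List[int], num_procs: int) -> List[Tuple[int, int]]:
    """Split iterations 0..n-1 into num_procs contiguous [start, end) chunks.

    Each cut is placed where the running cost reaches its proportional share
    of the total, so chunks have roughly equal estimated time rather than
    equal iteration counts.
    """
    total = sum(costs)
    bounds = [0]
    running = 0
    i = 0
    for p in range(1, num_procs):
        target = total * p / num_procs
        # Advance while the next iteration still fits under this cut's target.
        while i < len(costs) and running + costs[i] <= target:
            running += costs[i]
            i += 1
        bounds.append(i)
    bounds.append(len(costs))
    return [(bounds[k], bounds[k + 1]) for k in range(num_procs)]


def chunk_costs(costs: List[int], chunks: List[Tuple[int, int]]) -> List[int]:
    """Total estimated cycles per chunk, useful for checking the balance."""
    return [sum(costs[start:end]) for start, end in chunks]


# Example: the first two iterations miss in the SPM, the remaining 41 hit.
example_costs = [iteration_cost(1, 1)] * 2 + [iteration_cost(2, 0)] * 41
example_chunks = balanced_chunks(example_costs, 2)
```

In this example, the two expensive iterations form one chunk and the 41 cheap iterations form the other, and both chunks have the same estimated cost. An equal-count split would instead give the first processor almost three times the estimated work of the second. A fuller scheduler would also have to model how SPM contents change as iterations execute, which this sketch ignores.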