A two-level scheduling method: an effective parallelizing technique for uniform nested loops on a DSP multiprocessor

Authors:
Yi-Hsuan Lee;Cheng Chen
Affiliations:
Department of Computer Science and Information Engineering, National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu, Taiwan 30050, PR China;Department of Computer Science and Information Engineering, National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu, Taiwan 30050, PR China
Venue:
Journal of Systems and Software - Special issue: Software engineering education and training
Year:
2005

Abstract

A digital signal processor (DSP) is a special-purpose microprocessor designed to achieve high performance on DSP applications. Because most DSP applications contain many nested loops and permit a very high degree of parallelism, a DSP multiprocessor is a suitable architecture for executing them. Unfortunately, conventional scheduling methods for DSP multiprocessors allocate only one operation to each DSP in every time unit, even if the DSP includes several function units that can operate in parallel, so they cannot achieve full function-unit utilization. In this paper, we propose a two-level scheduling method (TSM) to overcome this common failing. TSM contains two approaches, which integrate unimodular transformations, the loop tiling technique, and conventional methods used on a single DSP. Besides introducing the algorithm, we also use an analytic module to analyze its preliminary performance. Based on our analyses, TSM achieves shorter execution time and more scalable speedup. In addition, TSM incurs lower memory access and synchronization overheads, which are usually negligible in the DSP multiprocessor architecture.