Variable Partitioning and Scheduling for MPSoC with Virtually Shared Scratch Pad Memory

Authors:
Lei Zhang; Meikang Qiu; Wei-Che Tseng; Edwin H.-M. Sha
Affiliations:
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Sichuan, P. R. China 610054; Department of Electrical Engineering, University of New Orleans, New Orleans, USA 70148; Department of Computer Science, University of Texas at Dallas, Richardson, USA 75083; Department of Computer Science, University of Texas at Dallas, Richardson, USA 75083
Venue:
Journal of Signal Processing Systems
Year:
2010

Abstract

On-chip memory is one of the most critical components determining the success of an MPSoC-based architecture. Scratch Pad Memory (SPM) is increasingly used in place of cache as the on-chip memory of embedded MPSoCs because of its smaller chip area, lower power consumption, and better timing predictability. SPM can be organized as a Virtually Shared SPM (VS-SPM) architecture that takes advantage of both shared and private SPM. However, making effective use of the VS-SPM architecture depends strongly on solving two interdependent problems: variable partitioning and task scheduling. In this paper, we decouple these two problems and solve them in a phase-ordered manner. We propose two variable partitioning heuristics based on an initial schedule: High Access Frequency First (HAFF) variable partitioning and Global View Prediction (GVP) variable partitioning. We then present a loop pipeline scheduling algorithm, Rotation Scheduling with Variable Partitioning (RSVP), to improve overall throughput. Our experimental results on MiBench show that, on a four-core MPSoC, the average performance improvements over IDAS (Integrated Data Assignment with Scheduling) are 23.74% for HAFF and 31.91% for GVP. The average schedule length generated by RSVP is 25.96% shorter than that of list scheduling with optimal variable partitioning.
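
The abstract names HAFF but does not describe its steps. As an illustration only, the following is a minimal sketch of what a generic "highest access frequency first" greedy placement could look like. It is not the authors' algorithm. The `Core` class, the `frequency_first_partition` function, the per-core access-count input, and the spill-to-off-chip fallback are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Hypothetical model of one core's local SPM (not from the paper)."""
    capacity: int                      # free SPM space, in arbitrary size units
    placed: list = field(default_factory=list)


def frequency_first_partition(variables, access_counts, cores):
    """Greedy sketch: place the most frequently accessed variables first.

    variables:     {name: size}
    access_counts: {name: {core_index: number_of_accesses}}
    cores:         list of Core objects (mutated in place)
    Returns {name: core_index, or None if the variable fits in no SPM}.
    """
    # Visit variables in descending order of total access count
    # (ties broken by name so the result is deterministic).
    order = sorted(variables,
                   key=lambda v: (-sum(access_counts[v].values()), v))
    placement = {}
    for v in order:
        size = variables[v]
        # Try the core that accesses v most often first, then the others.
        ranked = sorted(range(len(cores)),
                        key=lambda c: (-access_counts[v].get(c, 0), c))
        for c in ranked:
            if cores[c].capacity >= size:
                cores[c].capacity -= size
                cores[c].placed.append(v)
                placement[v] = c
                break
        else:
            # Assumption: anything that does not fit stays off-chip.
            placement[v] = None
    return placement


if __name__ == "__main__":
    sizes = {"a": 4, "b": 4, "c": 4}
    counts = {"a": {0: 10, 1: 1}, "b": {0: 8, 1: 2}, "c": {1: 3}}
    print(frequency_first_partition(sizes, counts, [Core(4), Core(4)]))
```

In this toy input, `a` takes core 0's SPM, `b` falls back to core 1 because core 0 is full, and `c` is left off-chip. A VS-SPM-aware heuristic would also have to weigh the cost of remote SPM accesses against off-chip accesses, which this sketch ignores.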