Multi-core Digital Signal Processors (DSPs) place heavy demands on data storage and memory performance in high-performance embedded applications. Scratch-pad memories (SPMs) are small, fast on-chip memories mapped into the global address space; embedded applications prefer them to traditional caches because their timing is more predictable in real-time settings. We construct a new Fast Close-Coupled Shared Data Pool (FCC-SDP), based on SPMs, for our multi-core DSP project. FCC-SDP is organized as a multibank parallel structure with double-bank interleaved access modes and provides a fast transfer path for fine-grained shared data among DSP cores. We build a behavioral simulator of FCC-SDP and complete its design implementation. Simulation experiments with several typical benchmarks show that FCC-SDP captures the fine-grained shared data in multi-core applications well, achieving average speedups of 1.1 and 1.14 over traditional shared L2 caches and DMA transfers, respectively.
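As a rough illustration of what interleaving across a pair of banks can look like, the sketch below maps word addresses to a bank and a row using low-order interleaving, so that consecutive words fall in different banks and can be accessed in parallel. The abstract does not give the actual FCC-SDP address mapping; the function names, the two-bank group size, and the word-addressed layout are all assumptions made for illustration only.

```python
# Hypothetical sketch of low-order bank interleaving; not the FCC-SDP design.
# Assumption: memory is word-addressed and a group holds two banks.

def interleaved_location(word_addr, banks_in_group=2):
    """Return (bank, row) for a word address under low-order interleaving."""
    if word_addr < 0:
        raise ValueError("word_addr must be non-negative")
    bank = word_addr % banks_in_group   # low-order bits select the bank
    row = word_addr // banks_in_group   # remaining bits select the row
    return bank, row


def conflict_free(word_addrs, banks_in_group=2):
    """True if every address in the batch targets a different bank."""
    banks = [interleaved_location(a, banks_in_group)[0] for a in word_addrs]
    return len(set(banks)) == len(banks)


if __name__ == "__main__":
    for addr in range(4):
        print(addr, interleaved_location(addr))
    print(conflict_free([4, 5]))  # consecutive words: different banks
    print(conflict_free([4, 6]))  # stride of two: same bank
```

Under this assumed mapping, two cores that touch adjacent words are served by different banks in the same cycle, while accesses with a stride equal to the group size collide on one bank.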