On randomly interleaved memories
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Proceedings of the 6th international workshop on Hardware/software codesign
Proceedings of the 27th annual international symposium on Computer architecture
Embedded Multiprocessors: Scheduling and Synchronization
Embedded Multiprocessors: Scheduling and Synchronization
Computers and Intractability; A Guide to the Theory of NP-Completeness
Computers and Intractability; A Guide to the Theory of NP-Completeness
A Comparison of Heuristics for Scheduling DAGs on Multiprocessors
Proceedings of the 8th International Symposium on Parallel Processing
Command Vector Memory Systems: High Performance at Low Cost
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
Converting graphical DSP programs into memory constrained software prototypes
RSP '95 Proceedings of the Sixth IEEE International Workshop on Rapid System Prototyping (RSP'95)
Memory Controller Optimizations for Web Servers
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Modern dram memory systems: performance analysis and scheduling algorithm
Modern dram memory systems: performance analysis and scheduling algorithm
DRAMsim: a memory system simulator
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Optimizing synchronization in multiprocessor DSP systems
IEEE Transactions on Signal Processing
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
High-performance and low-energy buffer mapping method for multiprocessor DSP systems
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
Design methodologies and tools based on the synchronous dataflow (SDF) model of computation have proven useful for rapid prototyping and implementation of digital signal processing (DSP) applications on multiprocessor systems. One significant problem that arises when mapping applica-tions onto such embedded multiprocessors is the memory wall problem, which is becoming increasingly dominant in multiprocessor environments. In this paper, to help alleviate the memory wall problem, we propose a novel, high-performance buffer mapping policy for SDF-represented DSP applications on multiprocessor systems that support the shared-memory programming model. The proposed pol-icy exploits the bank concurrency of a DRAM main memory system according to the analysis of the major forms of par-allelism. The throughput is measured on both synthetic and real benchmarks. The simulation results show that the pro-posed buffer mapping policy is very useful, especially in memory-intensive applications where the total execution time of computational tasks is relatively small compared to that of memory operations. The performance improvement produced by our method is generally attained at the cost of additional banks and decreased bank utilization.