Return data interleaving for multi-channel embedded CMPs systems

Authors:
Fei Hong;Aviral Shrivastava;Jongeun Lee
Affiliations:
School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ;School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ;School of Electrical and Computer Engineering, Ulsan National Institute of Science and Technology, Ulsan, South Korea
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
2012

Abstract

Using multi-channel memory subsystems is an efficient way of satisfying the high volume of memory requests from CMPs. At the same time, the imbalance between memory bandwidth and bus performance opens up new possibilities for optimizing return data before it is sent to the bus. This paper presents a new memory controller design for embedded CMP systems that schedules how return data in the return buffer is sent back to the bus. Our scheduling policy, called return data interleaving (RDI), interleaves the return data of each request in a round-robin manner. Further, for each request, it sends the critical word first. To evaluate our technique, we model an Intel XScale-based CMP using the M5 simulator for CMP simulation and DRAMsim for memory subsystem simulation, and examine the performance of MiBench and SPEC 2000 benchmarks. Simulation results show that, for memory-bound benchmarks running on CMP systems with 6 to 16 cores, RDI improves execution time by 11% on average and by up to 16.9%.
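The two ideas named in the abstract, round-robin interleaving across pending requests and critical-word-first ordering within each request, can be illustrated with a minimal Python sketch. This is not the authors' controller design: the `Request` fields, the function names, and the wrap-around order of the words following the critical word are all assumptions made for illustration, since the abstract specifies only that the critical word is sent first.

```python
from collections import deque
from typing import List, NamedTuple, Tuple


class Request(NamedTuple):
    """Hypothetical model of one read request waiting in the return buffer."""
    core: int       # id of the requesting core (assumed field)
    num_words: int  # number of bus words in the returned cache line
    critical: int   # index of the word the core is stalled on


def word_order(req: Request) -> List[int]:
    # Critical word first; the remaining words follow in wrap-around
    # order (an assumption, the abstract does not state this order).
    return [(req.critical + i) % req.num_words for i in range(req.num_words)]


def rdi_schedule(requests: List[Request]) -> List[Tuple[int, int]]:
    """Return the bus transfer order as (core, word_index) pairs."""
    pending = deque((r.core, deque(word_order(r))) for r in requests)
    bus_order = []
    while pending:
        core, words = pending.popleft()
        bus_order.append((core, words.popleft()))  # send one word
        if words:
            pending.append((core, words))  # rotate to the back of the queue
    return bus_order


if __name__ == "__main__":
    reqs = [Request(core=0, num_words=4, critical=2),
            Request(core=1, num_words=4, critical=0)]
    for core, word in rdi_schedule(reqs):
        print(f"core {core}: word {word}")
```

In this sketch, with N pending requests, every core's critical word appears within the first N bus slots, whereas sending each request's data back to back would make the last request wait for all earlier lines to finish transferring.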