Achieving the required main memory (DRAM) bandwidth at acceptable power levels for current and future applications is a major challenge for designers of Systems-on-Chip for mobile platforms. Three-dimensional (3D) integration and 3D-stacked DRAM promise a significant boost in bandwidth at low power levels by exploiting multiple channels and wide data interfaces. In this paper, we address the problem of efficiently exploiting the multiple channels provided by standard (JEDEC WIDE-IO) 3D-stacked memories, so as to extract the maximum effective bandwidth and minimize latency for main memory accesses. We propose a new distributed interleaved access method that leverages the on-chip interconnect to simplify the design and implementation of the DRAM controller, without degrading performance relative to traditional centralized implementations. We perform experiments on a realistic workload for a mobile communication and multimedia platform and show that the proposed distributed interleaving memory access method improves overall throughput while minimally impacting the performance of latency-sensitive communication flows.
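As a rough illustration of what interleaving accesses across multiple memory channels involves, the sketch below maps a flat physical address to a channel and a channel-local address, and splits one request into per-channel pieces. It is not the method proposed in the paper: the channel count, the interleaving granularity, and the names `map_address` and `split_burst` are all assumptions chosen for the example.

```python
# Minimal sketch of block-level channel interleaving.
# All constants and function names are hypothetical, not taken from the paper.

NUM_CHANNELS = 4        # assumption: four independent memory channels
INTERLEAVE_BYTES = 64   # assumption: interleaving granularity in bytes


def map_address(addr):
    """Map a flat address to (channel, channel-local address)."""
    block = addr // INTERLEAVE_BYTES          # index of the interleaving block
    channel = block % NUM_CHANNELS            # consecutive blocks rotate over channels
    local_block = block // NUM_CHANNELS       # block index within that channel
    local_addr = local_block * INTERLEAVE_BYTES + addr % INTERLEAVE_BYTES
    return channel, local_addr


def split_burst(addr, length):
    """Split one request into (channel, local address, size) pieces,
    cutting at every interleaving-block boundary."""
    pieces = []
    end = addr + length
    while addr < end:
        boundary = (addr // INTERLEAVE_BYTES + 1) * INTERLEAVE_BYTES
        chunk_end = min(boundary, end)
        channel, local_addr = map_address(addr)
        pieces.append((channel, local_addr, chunk_end - addr))
        addr = chunk_end
    return pieces


if __name__ == "__main__":
    # A 256-byte request starting at address 0 touches every channel once.
    for piece in split_burst(0, 256):
        print(piece)
```

Under these assumptions, a long sequential request is spread evenly over all channels. In a distributed scheme, a split of this kind could in principle be performed at the network interfaces of the interconnect rather than in one central controller, though the paper's actual mechanism may differ.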