Route packets, not wires: on-chip inteconnection networks
Proceedings of the 38th annual Design Automation Conference
Bridging the Processor-Memory Performance Gapwith 3D IC Technology
IEEE Design & Test
A thermally-aware performance analysis of vertically integrated (3-D) processor-memory hierarchy
Proceedings of the 43rd annual Design Automation Conference
Three-dimensional integrated circuits
IBM Journal of Research and Development - Advanced silicon technology
PicoServer: using 3D stacking technology to enable a compact energy efficient chip multiprocessor
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Die Stacking (3D) Microarchitecture
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
3D-Stacked Memory Architectures for Multi-core Processors
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Supporting vertical links for 3D networks-on-chip: toward an automated design and analysis flow
Proceedings of the 2nd international conference on Nano-Networks
Achieving predictable performance through better memory controller placement in many-core CMPs
Proceedings of the 36th annual international symposium on Computer architecture
Cluster-based topologies for 3D stacked architectures
Proceedings of the 8th ACM International Conference on Computing Frontiers
Refinement-Based modeling of 3d nocs
FSEN'11 Proceedings of the 4th IPM international conference on Fundamentals of Software Engineering
A distributed interleaving scheme for efficient access to WideIO DRAM memory
Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Asymmetric DRAM synthesis for heterogeneous chip multiprocessors in 3D-stacked architecture
Proceedings of the International Conference on Computer-Aided Design
Distributed memory interface synthesis for network-on-chips with 3D-stacked DRAMs
Proceedings of the International Conference on Computer-Aided Design
Proceedings of the 2013 Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools
Cluster-based topologies for 3D Networks-on-Chip using advanced inter-layer bus architecture
Journal of Computer and System Sciences
Architecture and optimal configuration of a real-time multi-channel memory controller
Proceedings of the Conference on Design, Automation and Test in Europe
An energy efficient DRAM subsystem for 3D integrated SoCs
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Direct distributed memory access for CMPs
Journal of Parallel and Distributed Computing
Hi-index | 0.00 |
Historically, processor performance has increased at a much faster rate than that of main memory and up-coming NoC-based many-core architectures are further tightening the memory bottleneck. 3D integration based on TSV technology may provide a solution, as it enables stacking of multiple memory layers, with orders-of-magnitude increase in memory interface bandwidth, speed and energy efficiency. To fully exploit this potential, the architectural interface to vertically stacked memory must be streamlined. In this paper we present an efficient and flexible distributed memory interface for 3D-stacked DRAM. Our interface ensures ultra-low-latency access to the memory modules on top of each processing element (vertically local memory neighborhoods). Communication to these local modules do not travel through the NoC and takes full advantage of the lower latency of vertical interconnect, thus speeding up significantly the common case. The interface still supports a convenient global address space abstraction with high-latency remote access, due to the slower horizontal interconnect. Experimental results demonstrate significant bandwidth improvement that ranges from 1.44x to 7.40x as compared to the JEDEC standard, with peaks of 4.53GB/s for direct memory access, and 850MB/s for remote access through the NoC.