System-level performance evaluation of three-dimensional integrated circuits
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special issue on system-level interconnect prediction
Interconnect characteristics of 2.5-D system integration scheme
Proceedings of the 2001 international symposium on Physical design
3D direct vertical interconnect microprocessors test vehicle
Proceedings of the 13th ACM Great Lakes symposium on VLSI
Efficient Thermal Placement of Standard Cells in 3D ICs using a Force Directed Approach
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
2.5D system integration: a design driven system implementation schema
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
3D Processing Technology and Its Impact on iA32 Microprocessors
ICCD '04 Proceedings of the IEEE International Conference on Computer Design
Three-Dimensional Cache Design Exploration Using 3DCacti
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Implementing Caches in a 3D Technology for High Performance Processors
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Bridging the Processor-Memory Performance Gapwith 3D IC Technology
IEEE Design & Test
A thermal-driven floorplanning algorithm for 3D ICs
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
Implementing Register Files for High-Performance Microprocessors in a Die-Stacked (3D) Technology
ISVLSI '06 Proceedings of the IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architectures
Thermal-driven multilevel routing for 3-D ICs
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Dynamic instruction schedulers in a 3-dimensional integration technology
GLSVLSI '06 Proceedings of the 16th ACM Great Lakes symposium on VLSI
A thermally-aware performance analysis of vertically integrated (3-D) processor-memory hierarchy
Proceedings of the 43rd annual Design Automation Conference
Design space exploration for 3D architectures
ACM Journal on Emerging Technologies in Computing Systems (JETC)
PicoServer: using 3D stacking technology to enable a compact energy efficient chip multiprocessor
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
A novel dimensionally-decomposed router for on-chip communication in 3D architectures
Proceedings of the 34th annual international symposium on Computer architecture
3D Integration for Introspection
IEEE Micro
Three-dimensional multiprocessor system-on-chip thermal optimization
CODES+ISSS '07 Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis
A modular 3d processor for flexible product design and technology migration
Proceedings of the 5th conference on Computing frontiers
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
3D-Stacked Memory Architectures for Multi-core Processors
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Proceedings of the 45th annual Design Automation Conference
Tera-scale computing and interconnect challenges
Proceedings of the 45th annual Design Automation Conference
Why should we do 3D integration?
Proceedings of the 45th annual Design Automation Conference
PicoServer: Using 3D stacking technology to build energy efficient servers
ACM Journal on Emerging Technologies in Computing Systems (JETC)
Investigating the effects of fine-grain three-dimensional integration on microarchitecture design
ACM Journal on Emerging Technologies in Computing Systems (JETC)
Trend and Challenge on System-on-a-Chip Designs
Journal of Signal Processing Systems
Three-dimensional Integrated Circuit Design
Three-dimensional Integrated Circuit Design
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Efficient implementation of decoupling capacitors in 3D processor-dram integrated computing systems
Proceedings of the 19th ACM Great Lakes symposium on VLSI
Analysis of challenges for on-chip optical interconnects
Proceedings of the 19th ACM Great Lakes symposium on VLSI
Design space exploration for 3-D cache
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A durable and energy efficient main memory using phase change memory technology
Proceedings of the 36th annual international symposium on Computer architecture
Hybrid cache architecture with disparate memory technologies
Proceedings of the 36th annual international symposium on Computer architecture
Scaling the bandwidth wall: challenges in and avenues for CMP scaling
Proceedings of the 36th annual international symposium on Computer architecture
Is 3D chip technology the next growth engine for performance improvement?
IBM Journal of Research and Development
Extending the effectiveness of 3D-stacked DRAM caches with an adaptive multi-queue policy
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Variation-tolerant non-uniform 3D cache management in die stacked multicore processor
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
The impact of liquid cooling on 3D multi-core processors
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU
Proceedings of the 37th annual international symposium on Computer architecture
Performance Evaluation of a Multicore System with Optically Connected Memory Modules
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
Traffic- and Thermal-Aware Run-Time Thermal Management Scheme for 3D NoC Systems
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
Quantifying and coping with parametric variations in 3D-stacked microarchitectures
Proceedings of the 47th Design Automation Conference
Cost-driven 3D integration with interconnect layers
Proceedings of the 47th Design Automation Conference
A multilayer nanophotonic interconnection network for on-chip many-core communications
Proceedings of the 47th Design Automation Conference
An efficient distributed memory interface for many-core platform with 3D stacked DRAM
Proceedings of the Conference on Design, Automation and Test in Europe
Efficient OpenMP data mapping for multicore platforms with vertically stacked memory
Proceedings of the Conference on Design, Automation and Test in Europe
Hardware trust implications of 3-D integration
WESS '10 Proceedings of the 5th Workshop on Embedded Systems Security
Dynamic thermal management in 3D multicore architectures
Proceedings of the Conference on Design, Automation and Test in Europe
Vertical stealing: robust, locality-aware do-all workload distribution for 3D MPSoCs
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Design exploration of hybrid caches with disparate memory technologies
ACM Transactions on Architecture and Code Optimization (TACO)
Simple but Effective Heterogeneous Main Memory with On-Chip Memory Controller Support
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Hardware assistance for trustworthy systems through 3-D integration
Proceedings of the 26th Annual Computer Security Applications Conference
Virtualizing network-on-chip resources in chip-multiprocessors
Microprocessors & Microsystems
Large-scale integrated photonics for high-performance interconnects
ACM Journal on Emerging Technologies in Computing Systems (JETC)
Iris: A hybrid nanophotonic network design for high-performance and low-power on-chip communication
ACM Journal on Emerging Technologies in Computing Systems (JETC)
A composite and scalable cache coherence protocol for large scale CMPs
Proceedings of the international conference on Supercomputing
A vertical bubble flow network using inductive-coupling for 3-D CMPs
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Architecting on-chip interconnects for stacked 3D STT-RAM caches in CMPs
Proceedings of the 38th annual international symposium on Computer architecture
Proceedings of the 38th annual international symposium on Computer architecture
Token3D: reducing temperature in 3d die-stacked CMPs through cycle-level power control mechanisms
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Modeling and analysis of micro-ring based silicon photonic interconnect for embedded systems
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Agent-based thermal management using real-time I/O communication relocation for 3D many-cores
PATMOS'11 Proceedings of the 21st international conference on Integrated circuit and system design: power and timing modeling, optimization, and simulation
Modeling the computational efficiency of 2-D and 3-D silicon processors for early-chip planning
Proceedings of the International Conference on Computer-Aided Design
A register-file approach for row buffer caches in die-stacked DRAMs
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Efficiently enabling conventional block sizes for very large die-stacked DRAM caches
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
A qualitative security analysis of a new class of 3-d integrated crypto co-processors
Cryptography and Security
Three-dimensional Integrated Circuits: Design, EDA, and Architecture
Foundations and Trends in Electronic Design Automation
Vertical link on/off control methods for wireless 3-d nocs
ARCS'12 Proceedings of the 25th international conference on Architecture of Computing Systems
Reliability-aware platform optimization for 3D chip multi-processors
The Journal of Supercomputing
Network-on-Chip virtualization in Chip-Multiprocessor Systems
Journal of Systems Architecture: the EUROMICRO Journal
Design-time performance evaluation of thermal management policies for SRAM and RRAM based 3D MPSoCs
Proceedings of the great lakes symposium on VLSI
Proceedings of the 49th Annual Design Automation Conference
Software—Practice & Experience
Spatial and temporal thermal characterization of stacked multicore architectures
ACM Journal on Emerging Technologies in Computing Systems (JETC)
RAIDR: Retention-Aware Intelligent DRAM Refresh
Proceedings of the 39th Annual International Symposium on Computer Architecture
Adaptive dynamic frequency scaling for thermal-aware 3d multi-core processors
ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part IV
XPoint cache: scaling existing bus-based coherence protocols for 2D and 3D many-core systems
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Thermal-aware task scheduling in 3D chip multiprocessor with real-time constrained workloads
ACM Transactions on Embedded Computing Systems (TECS) - Special issue on embedded systems for interactive multimedia services (ES-IMS)
Cluster-based topologies for 3D Networks-on-Chip using advanced inter-layer bus architecture
Journal of Computer and System Sciences
A Mostly-Clean DRAM Cache for Effective Hit Speculation and Self-Balancing Dispatch
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Adaptive cache management for a combined SRAM and DRAM cache hierarchy for multi-cores
Proceedings of the Conference on Design, Automation and Test in Europe
3D integration for power-efficient computing
Proceedings of the Conference on Design, Automation and Test in Europe
Resilient die-stacked DRAM caches
Proceedings of the 40th Annual International Symposium on Computer Architecture
Pragmatic integration of an SRAM row cache in heterogeneous 3-D DRAM architecture using TSV
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Analysis and runtime management of 3D systems with stacked DRAM for boosting energy efficiency
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Optimal placement of vertical connections in 3D Network-on-Chip
Journal of Systems Architecture: the EUROMICRO Journal
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Heterogeneous system coherence for integrated CPU-GPU systems
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Reducing inter-core cache contention with an adaptive bank mapping policy in DRAM cache
Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis
Simultaneously optimizing DRAM cache hit latency and miss rate via novel set mapping policies
Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
Direct distributed memory access for CMPs
Journal of Parallel and Distributed Computing
Hi-index | 0.00 |
3D die stacking is an exciting new technology that increases transistor density by vertically integrating two or more die with a dense, high-speed interface. The result of 3D die stacking is a significant reduction of interconnect both within a die and across dies in a system. For instance, blocks within a microprocessor can be placed vertically on multiple die to reduce block to block wire distance, latency, and power. Disparate Si technologies can also be combined in a 3D die stack, such as DRAM stacked on a CPU, resulting in lower power higher BW and lower latency interfaces, without concern for technology integration into a single process flow. 3D has the potential to change processor design constraints by providing substantial power and performance benefits. Despite the promising advantages of 3D, there is significant concern for thermal impact. In this research, we study the performance advantages and thermal challenges of two forms of die stacking: Stacking a large DRAM or SRAM cache on a microprocessor and dividing a traditional microarchitecture between two die in a stack Results: It is shown that a 32MB 3D stacked DRAM cache can reduce the cycles per memory access of a twothreaded RMS benchmark on average by 13% and as much as 55% while increasing the peak temperature by a negligible 0.08潞C. Off-die BW and power are also reduced by 66% on average. It is also shown that a 3D floorplan of a high performance microprocessor can simultaneously reduce power 15% and increase performance 15% with a small 14潞C increase in peak temperature. Voltage scaling can reach neutral thermals with a simultaneous 34% power reduction and 8% performance improvement.