A modular 3d processor for flexible product design and technology migration
Proceedings of the 5th conference on Computing frontiers
MIRA: A Multi-layered On-Chip Interconnect Router Architecture
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
3D-Stacked Memory Architectures for Multi-core Processors
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
A novel migration-based NUCA design for chip multiprocessors
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Proceedings of the 2009 SPEC Benchmark Workshop on Computer Performance Evaluation and Benchmarking
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Exploring serial vertical interconnects for 3D ICs
Proceedings of the 46th Annual Design Automation Conference
Variation-tolerant non-uniform 3D cache management in die stacked multicore processor
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Test-wrapper optimization for embedded cores in TSV-based three-dimensional SOCs
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
3D GPU architecture using cache stacking: performance, cost, power and thermal analysis
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
The impact of liquid cooling on 3D multi-core processors
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
Proceedings of the 37th annual international symposium on Computer architecture
Traffic- and Thermal-Aware Run-Time Thermal Management Scheme for 3D NoC Systems
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
Dynamic thermal management in 3D multicore architectures
Proceedings of the Conference on Design, Automation and Test in Europe
Exploiting narrow-width values for thermal-aware register file designs
Proceedings of the Conference on Design, Automation and Test in Europe
Microprocessors & Microsystems
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Proceedings of the 38th annual international symposium on Computer architecture
Thermal-aware floorplan schemes for reliable 3D multi-core processors
ICCSA'11 Proceedings of the 2011 international conference on Computational science and its applications - Volume Part II
Token3D: reducing temperature in 3d die-stacked CMPs through cycle-level power control mechanisms
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Three-dimensional Integrated Circuits: Design, EDA, and Architecture
Foundations and Trends in Electronic Design Automation
Recent thermal management techniques for microprocessors
ACM Computing Surveys (CSUR)
Reliability-aware platform optimization for 3D chip multi-processors
The Journal of Supercomputing
Exploiting narrow-width values for process variation-tolerant 3-D microprocessors
Proceedings of the 49th Annual Design Automation Conference
Spatial and temporal thermal characterization of stacked multicore architectures
ACM Journal on Emerging Technologies in Computing Systems (JETC)
Thermal-aware task scheduling in 3D chip multiprocessor with real-time constrained workloads
ACM Transactions on Embedded Computing Systems (TECS) - Special issue on embedded systems for interactive multimedia services (ES-IMS)
Thermal-constrained task allocation for interconnect energy reduction in 3-D homogeneous MPSoCs
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
3D integration technology greatly increases transistor density while providing faster on-chip communication. 3D implementations of processors can simultaneously provide both latency and power benefits due to reductions in critical wires. However, 3D stacking of active devices can potentially exacerbate existing thermal problems. In this work, we propose a family of Thermal Herding techniques that (1) reduces 3D power density and (2) locates a majority of the power on the top die closest to the heat sink. Our 3D/thermal-aware microarchitecture contributions include a significance-partitioned datapath that places the frequently switching 16-bits on the top die, a 3D-aware instruction scheduler allocation scheme, an address memoization approach for the load and store queues, a partial value encoding for the L1 data cache, and a branch target buffer that exploits a form of frequent partial value locality in target addresses. Compared to a conventional planar processor, our 3D processor achieves a 47.9% frequency increase which results in a 47.0% performance improvement (min 7%, max 77% on individual benchmarks), while simultaneously reducing total power by 20% (min 15%, max 30%). Without our Thermal Herding techniques, the worst case 3D temperature increases by 17 degrees. With our Thermal Herding techniques, the temperature increase is only 12 degrees (29% reduction in the 3D worst-case temperature increase).