Implementing Caches in a 3D Technology for High Performance Processors

Authors:
Kiran Puttaswamy;Gabriel H. Loh
Affiliations:
Georgia Institute of Technology School of Electrical and Computer Engineering;College of Computing
Venue:
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Year:
2005


Abstract

3D integration is an emergent technology that has the potential to greatly increase device density while simultaneously providing faster on-chip communication. 3D fabrication involves stacking two or more die connected with a very high-density and low-latency interface. The die-to-die vias that comprise this interface can be treated like regular on-chip metal due to their small size (on the order of 1 µm) and high speed (sub-FO4 die-to-die communication delay). The increased device density and the ability to place and route in the third dimension provide new opportunities for microarchitecture design. In this paper, we first present a brief overview of 3D integration technology. We then focus on the design of on-chip caches using 3D integration. In particular, we show that the dense die-to-die vias enable caches that are 3D-partitioned at the level of individual wordlines or bitlines. This results in a wire length reduction within SRAM arrays, and a reduction in the footprint of individual SRAM banks, which reduces the global routing from the edge of the cache to the banks and back. The wire length reduction provides both power and performance benefits, e.g., 21.5% latency reduction and 30.9% energy reduction for a 512KB cache. We also report that implementing only the caches in 3D, without accounting for possible benefits from implementing other components of the processor in 3D, results in a 12% IPC gain. These results demonstrate some of the potential of this new technology, and motivate further research in 3D microarchitectures.
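
As a rough illustration of why partitioning a wordline or bitline across stacked die shortens wires, the sketch below splits a wire evenly across the die and estimates the delay of one segment relative to the planar wire. It is not the model used in the paper: it assumes an unrepeated wire whose distributed-RC delay grows with the square of its length, and it ignores die-to-die via, driver, decoder, and sense-amplifier delay. The function names and the 1000 µm wire length are hypothetical. The latency and energy figures quoted in the abstract come from the authors' own evaluation and cannot be derived from this simplification.

```python
def folded_wire_length(length_um: float, num_dies: int) -> float:
    """Length of each segment when a wire is split evenly across stacked die."""
    if num_dies < 1:
        raise ValueError("num_dies must be at least 1")
    return length_um / num_dies


def relative_rc_delay(length_um: float, num_dies: int) -> float:
    """Delay of one folded segment relative to the original planar wire.

    Assumption (not from the paper): unrepeated distributed-RC wire, so delay
    is proportional to length squared; via and driver delay are ignored.
    """
    segment = folded_wire_length(length_um, num_dies)
    return (segment / length_um) ** 2


if __name__ == "__main__":
    planar_length_um = 1000.0  # hypothetical wordline length
    for dies in (1, 2, 4):
        seg = folded_wire_length(planar_length_um, dies)
        rel = relative_rc_delay(planar_length_um, dies)
        print(f"{dies} die: segment = {seg} um, relative wire delay = {rel}")
```

Under these assumptions the wire component alone shrinks quickly with the number of die, while the overall cache gain is smaller because the wire is only one contributor to access latency.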