3D GPU architecture using cache stacking: performance, cost, power and thermal analysis

Authors:
Ahmed Al Maashri;Guangyu Sun;Xiangyu Dong;Vijay Narayanan;Yuan Xie
Affiliations:
Department of Computer Science and Engineering, Penn State University;Department of Computer Science and Engineering, Penn State University;Department of Computer Science and Engineering, Penn State University;Department of Computer Science and Engineering, Penn State University;Department of Computer Science and Engineering, Penn State University
Venue:
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
Year:
2009


Abstract

Graphics Processing Units (GPUs) offer tremendous computational power. Their architecture requires high communication bandwidth and low latency between computation units and caches. 3D die-stacking technology is a promising approach to meeting these requirements. To the best of our knowledge, no other study has investigated the implementation of 3D technology in GPUs. In this paper, we study the impact of stacking caches using 3D technology on GPU performance. We also investigate the benefits of using 3D-stacked MRAM on GPUs. Our work includes cost, power, and thermal analysis of the proposed architectural designs. Our results show a 53% geometric mean performance speedup for iso-cycle-time architectures and about 19% for iso-cost architectures.
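
The abstract reports its headline results as geometric mean speedups. As a minimal sketch of how such a figure is typically computed, the Python snippet below takes the geometric mean of per-benchmark speedups (baseline time divided by new time). The function name and the timing values are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's own benchmarks and methodology may differ.

```python
import math


def geometric_mean_speedup(baseline_times, new_times):
    """Return the geometric mean of per-benchmark speedups (baseline / new)."""
    if len(baseline_times) != len(new_times) or not baseline_times:
        raise ValueError("need two non-empty lists of equal length")
    log_sum = 0.0
    for base, new in zip(baseline_times, new_times):
        if base <= 0 or new <= 0:
            raise ValueError("execution times must be positive")
        # Summing logs avoids overflow when multiplying many ratios.
        log_sum += math.log(base / new)
    return math.exp(log_sum / len(baseline_times))


if __name__ == "__main__":
    # Hypothetical execution times (arbitrary units), NOT data from the paper.
    baseline = [10.0, 8.0, 12.0]  # assumed 2D baseline runs
    stacked = [5.0, 8.0, 6.0]     # assumed 3D cache-stacked runs
    speedup = geometric_mean_speedup(baseline, stacked)
    print(f"geometric mean speedup: {speedup:.3f}x "
          f"({(speedup - 1) * 100:.1f}% improvement)")
```

The geometric mean is the usual choice for averaging ratios such as speedups, because a single outlier benchmark influences it less than it would an arithmetic mean.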