A multigrid solver for boundary value problems using programmable graphics hardware
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware
Temperature-aware microarchitecture
Proceedings of the 30th annual international symposium on Computer architecture
Understanding the efficiency of GPU algorithms for matrix-matrix multiplication
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware
Three-Dimensional Cache Design Exploration Using 3DCacti
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
A thermally-aware performance analysis of vertically integrated (3-D) processor-memory hierarchy
Proceedings of the 43rd annual Design Automation Conference
Energy/power breakdown of pipelined nanometer caches (90nm/65nm/45nm/32nm)
Proceedings of the 2006 international symposium on Low power electronics and design
A memory model for scientific algorithms on graphics processors
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
ACM SIGGRAPH 2007 courses
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Extending systems-on-chip to the third dimension: performance, cost and technological tradeoffs
Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design
Efficient computation of sum-products on GPUs through software-managed cache
Proceedings of the 22nd annual international conference on Supercomputing
System-level cost analysis and design exploration for three-dimensional integrated circuits (3D ICs)
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
Proceedings of the International Conference on Computer-Aided Design
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
Graphics Processing Units (GPUs) offer tremendous computational and processing power. The architecture requires high communication bandwidth and lower latency between computation units and caches. 3D die-stacking technology is a promising approach to meet such requirements. To the best of our knowledge no other study has investigated the implementation of 3D technology in GPUs. In this paper, we study the impact of stacking caches using the 3D technology on GPU performance. We also investigate the benefits of using 3D stacked MRAM on GPUs. Our work includes cost, power, and thermal analysis of the proposed architectural designs. Our results show a 53% geometric mean performance speedup for iso-cycle time architectures and about 19% for iso-cost architectures.