This paper presents an analysis of the performance of the shader processing units in a modern Graphics Processing Unit (GPU) architecture using real graphics applications. The architecture of a modern GPU is described, and a simulator and associated framework used to evaluate the architecture are introduced. The paper analyses the effects on performance of different configurations of the shader processing units and compares a classic GPU with a unified shader GPU. The evaluated unified shader architecture proves to be 15% to 30% more efficient in terms of area, with a 2% to 7% improvement in performance, when compared with a similar non-unified architecture.