We present and evaluate the TILA-rin GPU microarchitecture for embedded systems using the ATTILA GPU simulation framework. We use a trace from an execution of the Unreal Tournament 2004 PC game to evaluate and compare the performance of the proposed embedded GPU against a baseline GPU architecture for the PC. We evaluate the different elements that have been removed from the baseline GPU architecture to accommodate the architecture to the restricted power, bandwidth and area budgets of embedded systems. The unified shader architecture we present processes vertices, triangles and fragments in a single processing unit, saving space and reducing hardware complexity. The proposed embedded GPU architecture sustains 20 frames per second on the selected UT 2004 trace.