This paper describes a low-power GPU architecture for handheld systems with limited power and area budgets. The GPU uses logarithmic arithmetic to achieve a power- and area-efficient design. A multifunction unit is proposed for this GPU, based on a hybrid number system of floating-point and logarithmic numbers, in which matrix, vector, and elementary functions are unified into a single arithmetic unit. The unit achieves single-cycle throughput for all of these functions except matrix-vector multiplication, which has 2-cycle throughput. A vertex shader using this unit as its main datapath shows a 49.3% cycle count reduction compared with the latest work for the OpenGL transformation and lighting (TnL) kernel. The rendering engine also uses logarithmic arithmetic to implement the divisions in its pipeline stages. The GPU is divided into three dynamic voltage and frequency scaling (DVFS) power domains to minimize power consumption at a given performance level. It achieves 5.26 Mvertices/s at 200 MHz for OpenGL TnL and consumes 52.4 mW at 60 fps. Compared with the latest work, it delivers a 2.47-fold performance improvement while reducing power consumption by 50.5% and area by 38.4%.
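The appeal of logarithmic arithmetic for divisions and elementary functions can be illustrated with a minimal software sketch. It is not the paper's hardware design: the helper names (`to_log`, `from_log`, `log_mul`, `log_div`, `log_sqrt`) are hypothetical, the sketch handles positive values only, and it uses double-precision `math.log2` where a hardware unit would use dedicated logarithmic and antilogarithmic converters.

```python
import math

# Illustrative sketch only: emulates log-domain arithmetic in software.
# All function names here are hypothetical, not taken from the paper.

def to_log(x: float) -> float:
    """Convert a positive number to its base-2 logarithm."""
    if x <= 0.0:
        raise ValueError("this sketch handles positive values only")
    return math.log2(x)

def from_log(lx: float) -> float:
    """Convert a base-2 logarithm back to a linear-domain number."""
    return 2.0 ** lx

def log_mul(la: float, lb: float) -> float:
    """Multiplication in the linear domain is addition in the log domain."""
    return la + lb

def log_div(la: float, lb: float) -> float:
    """Division in the linear domain is subtraction in the log domain."""
    return la - lb

def log_sqrt(la: float) -> float:
    """Square root in the linear domain is halving in the log domain."""
    return la / 2.0

if __name__ == "__main__":
    a, b = 6.0, 1.5
    la, lb = to_log(a), to_log(b)
    print("a * b   ~", from_log(log_mul(la, lb)))
    print("a / b   ~", from_log(log_div(la, lb)))
    print("sqrt(a) ~", from_log(log_sqrt(la)))
```

Once operands are in the log domain, multiplication, division, and square root reduce to an add, a subtract, and a halving. Addition and subtraction of values do not simplify this way, which is one motivation for a hybrid floating-point and logarithmic number system.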