We embed special function units (SFUs) in the homogeneous stream processors (SPs) of a graphics processing unit (GPU) to improve its performance on modern programmable shaders, which make poor use of a single-instruction multiple-data (SIMD) architecture. We also compact instructions to reduce the size of the instruction memory, and we reduce area by placing only a partial SFU in each SP and sharing one lookup table among multiple SFUs. Compared to a baseline SIMD architecture, the result is an 88% increase in utilization and a 27% reduction in the normalized area-delay product. We verified the architecture on a field-programmable gate array (FPGA) evaluation platform with an ARM9 host processor and a full 3-D graphics pipeline.
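The idea of sharing one lookup table among several partial SFUs can be illustrated with a minimal software sketch. Everything here is an assumption made for illustration, not the design described above: the names `PartialSFU` and `SHARED_RECIP_TABLE`, the choice of the reciprocal as the special function, the 64-segment table size, and the use of linear interpolation are all hypothetical.

```python
import math

# Hypothetical table size: 64 segments covering mantissas in [1, 2).
TABLE_BITS = 6
SEGMENTS = 1 << TABLE_BITS

# A single table, built once. Every SFU below holds a reference to it
# rather than a private copy, which is the area-saving idea being sketched.
SHARED_RECIP_TABLE = [1.0 / (1.0 + i / SEGMENTS) for i in range(SEGMENTS + 1)]


class PartialSFU:
    """Hypothetical per-SP unit: interpolation logic only, no table of its own."""

    def __init__(self, table):
        self.table = table  # reference to the shared table, not a copy

    def reciprocal(self, x):
        """Approximate 1/x for x > 0 by linear interpolation in the shared table."""
        if x <= 0.0:
            raise ValueError("this sketch only handles positive inputs")
        m, e = math.frexp(x)      # x = m * 2**e with 0.5 <= m < 1
        m, e = m * 2.0, e - 1     # renormalize so that 1 <= m < 2
        pos = (m - 1.0) * SEGMENTS
        i = int(pos)              # table segment index, 0 .. SEGMENTS - 1
        frac = pos - i            # position inside the segment
        y = self.table[i] + frac * (self.table[i + 1] - self.table[i])
        return math.ldexp(y, -e)  # 1/x = (1/m) * 2**(-e)


# Four units, one table.
sfus = [PartialSFU(SHARED_RECIP_TABLE) for _ in range(4)]
approx = sfus[1].reciprocal(3.0)
```

In this sketch the per-unit cost is only the interpolation arithmetic, while the table storage is paid once. A hardware design would also have to arbitrate concurrent table accesses, which the sketch does not model.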