The multicluster architecture: reducing cycle time through partitioning
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Real-time rendering
Clock rate versus IPC: the end of the road for conventional microarchitectures
Proceedings of the 27th annual international symposium on Computer architecture
The Silicon Graphics 4D/240GTX Superworkstation
IEEE Computer Graphics and Applications
3D graphics LSI core for mobile phone "Z3D"
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware
SH-X: an embedded processor core for consumer appliances
MEDEA '04 Proceedings of the 2004 workshop on MEmory performance: DEaling with Applications , systems and architecture
Hi-index | 0.00 |
In this paper, a power efficient vertex processor for mobile graphics applications is presented. A four-threaded and four-issue expanded VLIW datapath with a quad-float vertex texture fetcher is proposed by exploiting graphics specific characteristics after evaluation of several candidate architectures. Instruction-level power control methods such as operand sharing and writeback re-allocation along with operand isolations and gated clocks result in 40.4% and 82% reduction in energy dissipation and energy delay product compared to the most widely used single threaded SIMD. The proposed processor with the optimized datapath and vertex caches implemented in a 0.18-µm 1P4M CMOS process achieves 186-Mvertices/s geometry performance which is the best result among the processors that are IEEE-754 compliant.