The Journal of Supercomputing
Versatile design of shared vector coprocessors for multicores
Microprocessors & Microsystems
Multicore-based vector coprocessor sharing for performance and energy gains
ACM Transactions on Embedded Computing Systems (TECS) - Special issue on application-specific processors
The Cell processor consists of a general-purpose core and eight cores, the Synergistic Processing Elements (SPEs), each with a complete SIMD instruction set. Although originally designed for multimedia and gaming, it is now used for a much broader range of applications. In this paper we evaluate whether the Cell SPEs could benefit significantly from a scalar processing unit, using two methodologies. In the first methodology, the scalar processing overhead is eliminated by replacing all scalar data types with the quadword data type. This methodology is feasible only for relatively small kernels. In the second methodology, SPE performance is compared to the performance of a similarly configured PPU, which supports scalar operations. Experimental results show that the scalar processing overhead ranges from 19% to 57% for small kernels and from 12% to 39% for large kernels. Solutions to eliminate this overhead are also discussed.
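To make the first methodology concrete, the following is a minimal sketch in portable C, not SPE intrinsics and not code from the paper. It assumes a core that can only load and store whole 128-bit quadwords, so a scalar access has to extract an element from, or insert it into, the enclosing quadword. The type `quad_t` and the helpers `load_scalar`, `store_scalar`, `load_quad`, `store_quad`, and `layouts_agree` are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit quadword: assumed to be the only unit the
   SIMD-only core can move between memory and registers. */
typedef struct { int32_t w[4]; } quad_t;

/* Scalar load from packed data: load the enclosing quadword,
   then extract the requested element. */
static int32_t load_scalar(const quad_t *mem, size_t i) {
    quad_t q = mem[i / 4];
    return q.w[i % 4];
}

/* Scalar store to packed data: a read-modify-write of the
   enclosing quadword (load, insert element, store back). */
static void store_scalar(quad_t *mem, size_t i, int32_t v) {
    quad_t q = mem[i / 4];
    q.w[i % 4] = v;
    mem[i / 4] = q;
}

/* Quadword variant: each value owns a whole quadword and sits in
   slot 0, so one plain load or store is enough. */
static int32_t load_quad(const quad_t *mem, size_t i) {
    return mem[i].w[0];
}

static void store_quad(quad_t *mem, size_t i, int32_t v) {
    quad_t q = {{v, 0, 0, 0}};
    mem[i] = q;
}

/* Applies the same update with both layouts; returns 1 if they agree. */
int layouts_agree(void) {
    quad_t packed[1] = {{{1, 2, 3, 4}}};
    quad_t widened[4] = {{{1, 0, 0, 0}}, {{2, 0, 0, 0}},
                         {{3, 0, 0, 0}}, {{4, 0, 0, 0}}};
    for (size_t i = 0; i < 4; i++) {
        store_scalar(packed, i, load_scalar(packed, i) + 10);
        store_quad(widened, i, load_quad(widened, i) + 10);
    }
    assert(packed[0].w[0] == 11);
    for (size_t i = 0; i < 4; i++) {
        if (load_scalar(packed, i) != load_quad(widened, i)) return 0;
    }
    return 1;
}
```

In this sketch the widened layout removes the extract and insert steps at the cost of four times the storage per value. How the authors actually apply the transformation to their kernels is not stated in the abstract.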