Exploring the VLSI Scalability of Stream Processors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Evaluating the Imagine Stream Architecture
Proceedings of the 31st annual international symposium on Computer architecture
Compiling for stream processing
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
A 64-bit stream processor architecture for scientific applications
Proceedings of the 34th annual international symposium on Computer architecture
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part I: ICCS 2007
Exploiting loop-dependent stream reuse for stream processors
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Embedded DSP Processor Design: Application Specific Instruction Set Processors
Embedded DSP Processor Design: Application Specific Instruction Set Processors
SRF coloring: stream register file allocation via graph coloring
Journal of Computer Science and Technology
ICA3PP'07 Proceedings of the 7th international conference on Algorithms and architectures for parallel processing
Implementation and optimization of dense LU decomposition on the stream processor
PPAM'07 Proceedings of the 7th international conference on Parallel processing and applied mathematics
Implementing and optimizing a data-intensive hydrodynamics application on the stream processor
ICCSA'07 Proceedings of the 2007 international conference on Computational science and its applications - Volume Part III
Symbiote: a Reconfigurable Logic Assisted Data Stream Management System (RLADSMS)
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Exploiting the reuse supplied by loop-dependent stream references for stream processors
ACM Transactions on Architecture and Code Optimization (TACO)
Optimization and evaluating of StreamYGX2 on MASA stream processor
ACSAC'06 Proceedings of the 11th Asia-Pacific conference on Advances in Computer Systems Architecture
Tiled multi-core stream architecture
Transactions on High-Performance Embedded Architectures and Compilers IV
Simulation-based evaluation of the Imagine stream processor with scientific programs
International Journal of High Performance Computing and Networking
Laplace transformation on the FT64 stream processor
ACSAC'07 Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture
Generating synthetic task graphs for simulating stream computing systems
Journal of Parallel and Distributed Computing
The Journal of Supercomputing
Stream Processor Architecture presents the architecture of the Imagine streaming media processor, which delivers a peak performance of 20 billion floating-point operations per second. Imagine efficiently supports 48 arithmetic units with a three-tiered data bandwidth hierarchy. At the base of the hierarchy, the streaming memory system employs memory access scheduling to maximize the sustained bandwidth of external DRAM. At the center of the hierarchy, the global stream register file enables streams of data to be recirculated directly from one computation kernel to the next without returning data to memory. Finally, local distributed register files that directly feed the arithmetic units enable temporary data to be stored locally so that it does not need to consume costly global register bandwidth. The bandwidth hierarchy enables Imagine to achieve up to 96% of the performance of a stream processor with infinite bandwidth from memory and the global register file.
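The effect of the three tiers on memory traffic can be illustrated with a toy word-count model. This is only a sketch under simplifying assumptions and is not taken from the book: the function `chain_traffic`, the `Traffic` record, and all parameters are hypothetical names, and the model assumes a linear chain of kernels in which each kernel reads one stream and writes one stream of equal length.

```python
from dataclasses import dataclass


@dataclass
class Traffic:
    dram_words: int  # words moved to or from external DRAM
    srf_words: int   # words moved through the global stream register file
    lrf_words: int   # temporary accesses served by local register files


def chain_traffic(n, num_kernels, temps_per_element, recirculate):
    """Count words moved at each tier for a linear chain of kernels.

    Assumptions (hypothetical, for illustration only):
    - each kernel reads one n-word stream and writes one n-word stream;
    - each kernel makes `temps_per_element` temporary accesses per element,
      all served by local register files;
    - with `recirculate=True`, intermediate streams stay in the stream
      register file; otherwise each one is stored to DRAM and reloaded.
    """
    # Every kernel reads and writes its streams through the stream register file.
    srf = 2 * n * num_kernels
    # Temporaries never leave the local register files.
    lrf = n * temps_per_element * num_kernels
    if recirculate:
        # Only the initial load and the final store touch DRAM.
        dram = 2 * n
    else:
        # Each kernel's input is loaded from DRAM and its output stored back.
        dram = 2 * n * num_kernels
    return Traffic(dram_words=dram, srf_words=srf, lrf_words=lrf)


if __name__ == "__main__":
    with_srf = chain_traffic(1024, 4, 10, recirculate=True)
    without_srf = chain_traffic(1024, 4, 10, recirculate=False)
    print("DRAM words with recirculation:   ", with_srf.dram_words)
    print("DRAM words without recirculation:", without_srf.dram_words)
```

In this model, DRAM traffic with recirculation is independent of the number of kernels in the chain, while the bulk of all accesses falls on the local register files. That is the qualitative behavior the abstract attributes to the hierarchy; the numbers themselves carry no meaning for the real Imagine hardware.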