The Cell Broadband Engine® (Cell/B.E.) processor was designed to provide a mix of central cores for control code and accelerators optimized for data processing. A heterogeneous design allows different processor elements to be optimized for specific functions and makes each processor element more area- and power-efficient. Cell/B.E. processor-based systems are the most powerful and the most power-efficient systems in the world, as represented by the Top500™ and Green500 lists. This paper offers a new view of the architectural design choices that were made in consideration of software usability and application development for the Cell/B.E. processor. Specifically, we explore the concept of integrated executables, which allow a single application to execute across multiple heterogeneous processor elements. Hardware and software architectures were co-optimized so that an application executing on multiple heterogeneous cores can communicate and share data efficiently, which is key to exploiting chip multiprocessors with ever-increasing numbers of cores.