POWER5 System microarchitecture
IBM Journal of Research and Development - POWER5 and packaging
IBM Journal of Research and Development
IBM POWER6 accelerators: VMX and DFU
IBM Journal of Research and Development
IBM Journal of Research and Development
POWER4 system microarchitecture
IBM Journal of Research and Development
Embedded DRAM: technology platform for the Blue Gene/L chip
IBM Journal of Research and Development
IBM Journal of Research and Development
IBM Journal of Research and Development
PERCS: the IBM power7-IH high-performance computing system
IBM Journal of Research and Development
Functional verification of the IBM POWER7 microprocessor and POWER7 multiprocessor systems
IBM Journal of Research and Development
Optimizing matrix transposes using a POWER7 cache model and explicit prefetching
Proceedings of the second international workshop on Performance modeling, benchmarking and simulation of high performance computing systems
Enhancing the performance of assisted execution runtime systems through hardware/software techniques
Proceedings of the 26th ACM international conference on Supercomputing
Composable, non-blocking collective operations on power7 IH
Proceedings of the 26th ACM international conference on Supercomputing
RAIDR: Retention-Aware Intelligent DRAM Refresh
Proceedings of the 39th Annual International Symposium on Computer Architecture
A case for exploiting subarray-level parallelism (SALP) in DRAM
Proceedings of the 39th Annual International Symposium on Computer Architecture
Making data prefetch smarter: adaptive prefetching on POWER7
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Optimizing matrix transposes using a POWER7 cache model and explicit prefetching
ACM SIGMETRICS Performance Evaluation Review
Fair CPU time accounting in CMP+SMT processors
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Delta-compressed caching for overcoming the write bandwidth limitation of hybrid main memory
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Systematic Energy Characterization of CMP/SMT Processor Systems via Automated Micro-Benchmarks
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
The power 775 architecture at scale
Proceedings of the 27th international ACM conference on International conference on supercomputing
Combining RAM technologies for hard-error recovery in L1 data caches working at very-low power modes
Proceedings of the Conference on Design, Automation and Test in Europe
Understanding and mitigating refresh overheads in high-density DDR4 DRAM systems
Proceedings of the 40th Annual International Symposium on Computer Architecture
SMT-centric power-aware thread placement in chip multiprocessors
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
A tool to analyze the performance of multithreaded programs on NUMA architectures
Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming
Applications of the streamed storage format for sparse matrix operations
International Journal of High Performance Computing Applications
Hi-index | 0.00 |
The IBM POWER® processor is the dominant reduced instruction set computing microprocessor in the world today, with a rich history of implementation and innovation over the last 20 years. In this paper, we describe the key features of the POWER7® processor chip. On the chip is an eight-core processor, with each core capable of four-way simultaneous multithreaded operation. Fabricated in IBM's 45-nm silicon-on-insulator (SOI) technology with 11 levels of metal, the chip contains more than one billion transistors. The processor core and caches are significantly enhanced to boost the performance of both single-threaded response-time-oriented, as well as multithreaded, throughput-oriented applications. The memory subsystem contains three levels of on-chip cache, with SOI embedded dynamic random access memory (DRAM) devices used as the last level of cache. A new memory interface using buffered double-data-rate-three DRAM and improvements in reliability, availability, and serviceability are discussed