The Power7 is IBM's first eight-core processor, with each core capable of four-way simultaneous-multithreading operation. Its key architectural features include an advanced memory hierarchy with three levels of on-chip cache; embedded-DRAM devices used in the highest level of the cache; and a new memory interface. This balanced multicore design scales from 1 to 32 sockets in commercial and scientific environments.
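The scaling figures above imply a simple thread-count calculation. The sketch below is only an illustration of that arithmetic: the core count, SMT width, and socket range come from the text, while the function name and the assumption of one processor chip per socket are hypothetical and not stated in the source.

```python
# Figures taken from the abstract.
CORES_PER_CHIP = 8       # eight cores per processor
THREADS_PER_CORE = 4     # four-way simultaneous multithreading
MIN_SOCKETS, MAX_SOCKETS = 1, 32


def hardware_threads(sockets: int) -> int:
    """Total hardware threads, assuming one chip per socket (an assumption)."""
    if not MIN_SOCKETS <= sockets <= MAX_SOCKETS:
        raise ValueError("the abstract describes systems of 1 to 32 sockets")
    return sockets * CORES_PER_CHIP * THREADS_PER_CORE


if __name__ == "__main__":
    # Print the thread count at a few points in the stated socket range.
    for sockets in (1, 2, 4, 8, 16, 32):
        print(f"{sockets:2d} sockets: {hardware_threads(sockets)} hardware threads")
```

Under these assumptions a single socket exposes 32 hardware threads, and the largest configuration mentioned exposes 1024.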