Proposition for a sequential accelerator in future general-purpose manycore processors and the problem of migration-induced cache misses

Authors:
Pierre Michaud;Yiannakis Sazeides;André Seznec
Affiliations:
INRIA, Rennes, France;University of Cyprus, Nicosia, Cyprus;INRIA, Rennes, France
Venue:
Proceedings of the 7th ACM international conference on Computing frontiers
Year:
2010

Abstract

As the number of transistors on a chip doubles with every technology generation, the number of on-chip cores also increases rapidly, making it possible in the foreseeable future to design processors featuring hundreds of general-purpose cores. However, although a large number of cores speeds up parallel code sections, Amdahl's law requires speeding up sequential sections too. We argue that it will become possible to dedicate a substantial fraction of the chip area and power budget to achieving high sequential performance. Current general-purpose processors contain a handful of cores designed to be continuously active and to run in parallel. This leads to power and thermal constraints that limit each core's performance. We propose removing these constraints with a sequential accelerator (SACC). A SACC consists of several cores designed for ultimate sequential performance. These cores cannot run continuously: a single core is active at any time, and the rest are inactive and power-gated. We periodically migrate execution to another core to spread heat generation uniformly over the whole SACC area, thus addressing the temperature issue. The SACC will be viable only if it yields a significant sequential performance gain. Migration-induced cache misses may limit performance gains, and we propose some solutions to mitigate this problem. We also investigate a migration method using thermal sensors, such that the migration interval depends on the ambient temperature and the migration penalty is negligible under normal thermal conditions.
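
The Amdahl's law argument above can be illustrated with a short calculation. This is only a sketch: the function name `amdahl_speedup`, the `sequential_boost` parameter, and the numbers (95% parallel code, 256 cores, a 2x faster sequential core) are illustrative assumptions and are not taken from the paper.

```python
def amdahl_speedup(parallel_fraction, n_cores, sequential_boost=1.0):
    """Overall speedup under Amdahl's law.

    parallel_fraction: share of the original run time that parallelizes.
    n_cores:           cores available to the parallel sections.
    sequential_boost:  hypothetical speedup applied to the sequential
                       sections only (1.0 means no acceleration).
    """
    sequential_fraction = 1.0 - parallel_fraction
    new_time = (sequential_fraction / sequential_boost
                + parallel_fraction / n_cores)
    return 1.0 / new_time


if __name__ == "__main__":
    # Assumed workload: 95% parallel, running on 256 cores.
    baseline = amdahl_speedup(0.95, 256)
    # Same workload with sequential sections running 2x faster.
    boosted = amdahl_speedup(0.95, 256, sequential_boost=2.0)
    print(f"no sequential acceleration: {baseline:.2f}x")
    print(f"2x sequential acceleration: {boosted:.2f}x")
```

Under these assumed numbers the overall speedup stays below 20x on 256 cores, because the 5% sequential share dominates the run time. Doubling sequential speed nearly doubles the overall speedup, which is the effect a sequential accelerator targets.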