Performance-asymmetry-aware scheduling for Chip Multiprocessors with static core coupling

Authors:
Jianbo Dong;Lei Zhang;Yinhe Han;Guihai Yan;Xiaowei Li
Affiliations:
Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, PR China and Graduate University of Chinese Academy of Sciences, Beijin ...;Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, PR China;Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, PR China and Graduate University of Chinese Academy of Sciences, Beijin ...;Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, PR China and Graduate University of Chinese Academy of Sciences, Beijin ...;Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, PR China and Graduate University of Chinese Academy of Sciences, Beijin ...
Venue:
Journal of Systems Architecture: the EUROMICRO Journal
Year:
2010

Abstract

Thread-level redundancy is an efficient approach to transient fault detection and recovery in Chip Multiprocessors (CMPs), in which two adjacent cores are statically coupled to form a functional Dual Modular Redundancy (DMR) unit. Manufacturing process variations cause core-to-core (C2C) performance asymmetry across the chip, which can be further divided into asymmetry among core-pairs and asymmetry within a core-pair. We call these inter-pair and intra-pair asymmetry, respectively; both should be taken into consideration when scheduling applications on CMPs with static core coupling. In this paper, we first formulate this scheduling problem as a 0-1 programming problem that maximizes the system Weighted Throughput. We then propose an efficient IVF&AppSen algorithm, which we prove to be optimal when the number of applications equals the number of core-pairs. We also adapt the Simulated Annealing technique to tackle the problem when there are fewer applications than core-pairs on the chip. Simulations on a 64-core CMP show that the proposed algorithms achieve a 2.5-9.3% improvement in Weighted Throughput compared to the prior VarF&AppIPC algorithm.
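
To make the 0-1 formulation concrete, the following is a minimal sketch of a toy instance solved by exhaustive search. It is not the paper's IVF&AppSen or Simulated Annealing algorithm, and it does not use the paper's definition of Weighted Throughput. The assumptions are hypothetical: a coupled pair runs at the frequency of its slower core (`pair_speed`), each application has a single frequency-sensitivity value in [0, 1] (`gain`), and all frequencies and sensitivities are made-up numbers.

```python
from itertools import permutations


def pair_speed(core_freqs):
    """Assumed model: a coupled pair is bounded by its slower core."""
    return min(core_freqs)


def gain(sensitivity, pair_freq, f_max):
    """Hypothetical normalized throughput of one application on one pair.

    sensitivity = 0: throughput ignores frequency.
    sensitivity = 1: throughput scales linearly with frequency.
    """
    return 1.0 - sensitivity * (1.0 - pair_freq / f_max)


def best_schedule(sensitivities, pairs):
    """Exhaustively solve the 0-1 assignment problem.

    Each application is placed on exactly one core-pair, and each
    core-pair hosts at most one application. The objective is the sum
    of the per-application gains (a stand-in for Weighted Throughput).
    """
    freqs = [pair_speed(p) for p in pairs]
    f_max = max(max(p) for p in pairs)
    best_value, best_assign = float("-inf"), None
    # permutations(range(m), n) enumerates every injective
    # application -> core-pair mapping.
    for assign in permutations(range(len(pairs)), len(sensitivities)):
        value = sum(gain(s, freqs[j], f_max)
                    for s, j in zip(sensitivities, assign))
        if value > best_value:
            best_value, best_assign = value, assign
    return best_value, best_assign


# Made-up core frequencies in GHz; pair 0 has the fastest single core
# but also the largest intra-pair asymmetry.
pairs = [(3.0, 2.4), (2.8, 2.7), (2.5, 2.5)]
# Made-up sensitivities: application 0 is frequency-sensitive,
# application 1 is not.
sensitivities = [0.9, 0.2]

value, assign = best_schedule(sensitivities, pairs)
print(assign, round(value, 4))
```

In this toy instance the search places application 0 on pair 1 and application 1 on pair 2, leaving pair 0 idle even though it contains the fastest core, because under the assumed model its slower core limits the pair. Exhaustive search only scales to a handful of pairs, which is why a 64-core chip calls for a dedicated algorithm or a heuristic such as Simulated Annealing.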