A task remapping technique for reliable multi-core embedded systems

Authors:
Chanhee Lee;Hokeun Kim;Hae-woo Park;Sungchan Kim;Hyunok Oh;Soonhoi Ha
Affiliations:
Seoul National University, Seoul, South Korea;Seoul National University, Seoul, South Korea;Seoul National University, Seoul, South Korea;Chonbuk National University, Jeonju, South Korea;Hanyang University, Seoul, South Korea;Seoul National University, Seoul, South Korea
Venue:
CODES/ISSS '10 Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Year:
2010

Abstract

With the continuous scaling of semiconductor technology, circuit lifetimes are decreasing, so processor failure has become an important issue in MPSoC design. A software solution for tolerating run-time processor failures is to migrate tasks from the failed processors to the live processors when a failure occurs. Previous work on run-time task migration usually aims to minimize the migration overhead, with or without a given latency constraint. For streaming applications, however, minimizing throughput degradation is more important than minimizing the migration overhead or the latency. Hence, we propose a task remapping technique that minimizes throughput degradation, assuming that the migration overhead can be safely amortized. The target multi-core system assumed in this paper consists of processor pools, each of which consists of homogeneous processors. The proposed technique is based on an intensive compile-time analysis of all possible failure scenarios. It involves the following steps: 1) determine the static mapping of tasks onto the live processors, aiming to minimize the throughput degradation; 2) find an optimal processor-to-processor mapping that minimizes the task migration overhead; and 3) store the resulting task remapping information, which includes the task mapping and processor-to-processor mapping results. Since the task remapping information is pre-computed at compile time for all possible failure scenarios, it must be represented and stored efficiently. At run time, we simply remap the tasks according to the compile-time decision. We examine the scalability of the proposed technique in terms of both the space and run-time overhead of the compile-time analysis as the number of failed processors varies. Through intensive experiments, we show that the proposed technique outperforms previous work with respect to application throughput.
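
The three steps named in the abstract can be illustrated with a small Python sketch. This is not the authors' algorithm: the abstract does not say how the task mapping or the processor-to-processor mapping is computed, so the sketch swaps in a greedy longest-processing-time partition for step 1 and a brute-force search over permutations for step 2. It also assumes a single pool of homogeneous processors, approximates throughput by the load of the busiest processor, and always measures migration cost against the initial mapping. All names (`lpt_partition`, `assign_groups`, `build_remap_table`, `bottleneck`) and all numbers are hypothetical.

```python
from itertools import combinations, permutations


def lpt_partition(exec_time, n_groups):
    """Step 1 (assumed heuristic): split tasks into n_groups groups,
    placing the longest remaining task on the least-loaded group."""
    groups = [[] for _ in range(n_groups)]
    loads = [0] * n_groups
    for task in sorted(exec_time, key=lambda t: (-exec_time[t], t)):
        g = min(range(n_groups), key=lambda i: loads[i])
        groups[g].append(task)
        loads[g] += exec_time[task]
    return groups


def assign_groups(groups, live, current, migration_cost):
    """Step 2 (brute force): bind each task group to a live processor so
    that the total cost of tasks that change processor is minimal.
    Processors are homogeneous, so the binding does not affect load."""
    best_cost, best_map = None, None
    for perm in permutations(live):
        mapping = {t: p for grp, p in zip(groups, perm) for t in grp}
        cost = sum(migration_cost[t] for t, p in mapping.items()
                   if current[t] != p)
        if best_cost is None or cost < best_cost:
            best_cost, best_map = cost, mapping
    return best_map, best_cost


def build_remap_table(exec_time, migration_cost, processors, initial,
                      max_failures):
    """Step 3: pre-compute one remapping per failure scenario, keyed by
    the set of failed processors, for lookup at run time."""
    table = {}
    for k in range(1, max_failures + 1):
        for failed in combinations(processors, k):
            live = [p for p in processors if p not in failed]
            groups = lpt_partition(exec_time, len(live))
            mapping, _ = assign_groups(groups, live, initial, migration_cost)
            table[frozenset(failed)] = mapping
    return table


def bottleneck(mapping, exec_time):
    """Load of the busiest processor; its inverse approximates throughput."""
    loads = {}
    for task, proc in mapping.items():
        loads[proc] = loads.get(proc, 0) + exec_time[task]
    return max(loads.values())


# Hypothetical example: five tasks on a pool of three processors.
exec_time = {"a": 4, "b": 3, "c": 3, "d": 2, "e": 2}
migration_cost = {t: 1 for t in exec_time}
processors = ["P0", "P1", "P2"]
initial = {"a": "P0", "b": "P1", "d": "P1", "c": "P2", "e": "P2"}

table = build_remap_table(exec_time, migration_cost, processors, initial,
                          max_failures=1)

# Run time: a failure of P2 is handled by a table lookup only.
remapped = table[frozenset({"P2"})]
print(remapped, bottleneck(remapped, exec_time))
```

The table holds one entry per failure scenario, so its size grows with the number of processor subsets considered. This is the space overhead whose scalability the abstract says is examined; the dictionary used here makes no attempt at the efficient representation the paper refers to.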