Future Multi-Processor System-on-Chip (MPSoC) platforms are becoming more vulnerable to transient and intermittent faults caused by physical-level problems of VLSI technologies. Because these faults make the Network-on-Chip (NoC) hardware of an MPSoC less reliable, they place new fault-tolerance requirements on the messaging-layer software that applications use for communication. This paper presents the Micron Message-Passing (MMP) Protocol, a lightweight protocol designed to improve the fault tolerance of the messaging layer of MPSoCs that use the Micronmesh NoC. Its fault tolerance is implemented with watchdog timers and Cyclic Redundancy Checks (CRC), which can detect packet losses, communication deadlocks, and bit errors. These three detection capabilities are necessary because without them the software executing on the MPSoC cannot detect the faults and recover from them. The paper also shows how the MMP Protocol can be used to implement applications that are able to recover from communication faults.
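
To illustrate the general idea of combining a watchdog timeout with a CRC check, the following C fragment gives a minimal sketch. It is not the MMP Protocol's actual interface: every name in it (`packet_t`, `packet_seal`, `receive_with_watchdog`, `sim_channel_t`, `sim_poll`) is hypothetical, the NoC is replaced by a simulated channel, the watchdog is modelled as a simple poll counter rather than a hardware timer, and the standard reflected CRC-32 polynomial 0xEDB88320 is assumed for the checksum.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSG_MAX_PAYLOAD 32u   /* hypothetical payload limit */

typedef enum { RX_OK, RX_CRC_ERROR, RX_TIMEOUT } rx_status_t;

/* Hypothetical packet layout: payload, its length, and a CRC over the payload. */
typedef struct {
    uint8_t  payload[MSG_MAX_PAYLOAD];
    uint32_t length;
    uint32_t crc;
} packet_t;

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, init and final XOR 0xFFFFFFFF). */
uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return ~crc;
}

/* Sender side: copy the payload and attach its CRC. Returns 0 on success, -1 if too long. */
int packet_seal(packet_t *pkt, const uint8_t *data, uint32_t len)
{
    if (len > MSG_MAX_PAYLOAD)
        return -1;
    memcpy(pkt->payload, data, len);
    pkt->length = len;
    pkt->crc = crc32_compute(pkt->payload, len);
    return 0;
}

/* Poll callback: returns 1 and fills *out when a packet has arrived, else 0. */
typedef int (*poll_fn)(packet_t *out, void *ctx);

/* Receiver side: give up after timeout_polls unsuccessful polls (watchdog expiry),
 * otherwise verify the CRC of the packet that arrived. */
rx_status_t receive_with_watchdog(poll_fn poll, void *ctx,
                                  uint32_t timeout_polls, packet_t *out)
{
    for (uint32_t n = 0; n < timeout_polls; n++) {
        if (poll(out, ctx)) {
            if (out->length > MSG_MAX_PAYLOAD)
                return RX_CRC_ERROR;   /* a corrupted length field is treated as a bit error */
            return crc32_compute(out->payload, out->length) == out->crc
                       ? RX_OK : RX_CRC_ERROR;
        }
    }
    return RX_TIMEOUT;   /* lost packet or stalled communication */
}

/* Simulated channel standing in for the NoC: delivers pkt after a number of polls. */
typedef struct {
    packet_t pkt;
    uint32_t ready_after_polls;   /* UINT32_MAX models a packet that never arrives */
    uint32_t polls_done;
} sim_channel_t;

int sim_poll(packet_t *out, void *ctx)
{
    sim_channel_t *ch = (sim_channel_t *)ctx;
    if (ch->polls_done++ < ch->ready_after_polls)
        return 0;
    *out = ch->pkt;
    return 1;
}
```

In this sketch an application would react to `RX_TIMEOUT` or `RX_CRC_ERROR` with its own recovery action, for example by requesting a retransmission. The choice of recovery policy is left to the application and is not shown here.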