Fault-tolerant communication over micronmesh NOC with micron message-passing protocol

  • Authors:
  • Heikki Kariniemi;Jari Nurmi

  • Affiliations:
  • Department of Computer Systems, Tampere University of Technology, Tampere, Finland;Department of Computer Systems, Tampere University of Technology, Tampere, Finland

  • Venue:
  • SOC'09 Proceedings of the 11th international conference on System-on-chip
  • Year:
  • 2009

Abstract

Future Multi-Processor System-on-Chip (MPSoC) platforms are becoming more vulnerable to transient and intermittent faults caused by physical-level problems of VLSI technologies. Because these faults make the Network-on-Chip (NoC) hardware of the MPSoCs less reliable, they place new fault-tolerance requirements on the messaging layer software that applications use for communication. This paper presents the Micron Message-Passing (MMP) Protocol, a lightweight protocol designed to improve the fault tolerance of the messaging layer of MPSoCs that use the Micronmesh NoC. Its fault tolerance is implemented with watchdog timers and Cyclic Redundancy Checks (CRC), which are used to detect packet losses, communication deadlocks, and bit errors. These three detection functions are necessary because without them the software executed on the MPSoCs cannot detect the faults and recover from them. This paper also presents how the MMP Protocol can be used to implement applications that are able to recover from communication faults.
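
The two detection mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the MMP Protocol itself: the packet layout (payload followed by a 4-byte CRC-32 trailer), the function names `make_packet`, `check_packet`, and `receive`, and the use of an in-process queue as a stand-in for a NoC channel are all assumptions made for illustration. The receive timeout plays the role of a watchdog timer that catches a lost packet or a stalled channel, and the CRC comparison catches bit errors.

```python
import queue
import struct
import zlib


def make_packet(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload as a 4-byte big-endian trailer (assumed layout)."""
    return payload + struct.pack(">I", zlib.crc32(payload))


def check_packet(packet: bytes) -> bytes:
    """Return the payload if the CRC trailer matches, otherwise raise ValueError."""
    if len(packet) < 4:
        raise ValueError("packet too short to carry a CRC trailer")
    payload, trailer = packet[:-4], packet[-4:]
    (expected,) = struct.unpack(">I", trailer)
    if zlib.crc32(payload) != expected:
        raise ValueError("CRC mismatch: bit error detected")
    return payload


def receive(channel: queue.Queue, timeout_s: float) -> bytes:
    """Wait for one packet; a timeout stands in for a watchdog timer expiring."""
    try:
        packet = channel.get(timeout=timeout_s)
    except queue.Empty:
        # Nothing arrived in time: packet loss or a stalled (deadlocked) channel.
        raise TimeoutError("watchdog expired: no packet received") from None
    return check_packet(packet)


if __name__ == "__main__":
    channel = queue.Queue()  # hypothetical stand-in for a NoC channel
    channel.put(make_packet(b"hello"))
    print(receive(channel, timeout_s=0.1))
```

In this sketch the caller sees a `TimeoutError` for a missing packet and a `ValueError` for a corrupted one, so an application could react to each case separately, for example by requesting retransmission.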