Improving the yield of NoC-based systems through fault diagnosis and adaptive routing

  • Authors:
  • Caroline Concatto;João Almeida;Guilherme Fachini;Marcos Hervé;Fernanda Kastensmidt;Érika Cota;Marcelo Lubaszewski

  • Affiliations:
  • Universidade Federal do Rio Grande do Sul (UFRGS), Instituto de Informática-Campus do Vale, Av. Bento Gonçalves 9500-Bloco IV, Caixa Postal 15064, 91501-970, Porto Alegre - RS, Brazil;Universidade Federal do Rio Grande do Sul (UFRGS), Instituto de Informática-Campus do Vale, Av. Bento Gonçalves 9500-Bloco IV, Caixa Postal 15064, 91501-970, Porto Alegre - RS, Brazil;Universidade Federal do Rio Grande do Sul (UFRGS), Instituto de Informática-Campus do Vale, Av. Bento Gonçalves 9500-Bloco IV, Caixa Postal 15064, 91501-970, Porto Alegre - RS, Brazil;CEITEC S.A., Estrada João de Oliveira Remião, 777, 91550-000, Porto Alegre - RS, Brazil;Universidade Federal do Rio Grande do Sul (UFRGS), Instituto de Informática-Campus do Vale, Av. Bento Gonçalves 9500-Bloco IV, Caixa Postal 15064, 91501-970, Porto Alegre - RS, Brazil;Universidade Federal do Rio Grande do Sul (UFRGS), Instituto de Informática-Campus do Vale, Av. Bento Gonçalves 9500-Bloco IV, Caixa Postal 15064, 91501-970, Porto Alegre - RS, Brazil;Universidade Federal do Rio Grande do Sul (UFRGS), Departamento de Engenharia Eletrica, Av. Osvaldo Aranha, 103 - Bairro Bom Fim, 90035-190, Porto Alegre - RS, Brazil

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2011

Abstract

We propose an effective, low-cost method to increase the yield and lifetime of torus NoCs. The method consists of detecting and diagnosing NoC interconnect faults using BIST structures and activating alternative paths for the faulty links. The alternative paths exploit the inherent redundancy of the torus topology, which keeps the performance, area, and power overhead minimal. We assume an extended interconnect fault model comprising stuck-at faults and pairwise shorts, either within a single link or between any two links in the network. Experimental results for a 3x3 NoC show that the proposed approach correctly diagnoses 93% of all possible interconnect faults and mitigates 42% of those faults (representing 94.4% of the solvable faults), with a worst-case performance penalty of 8% and an area overhead of 1%. We also demonstrate the scalability of the method by applying it to larger NoCs.
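
To illustrate the redundancy the abstract refers to, the following minimal sketch models an n x n torus and finds a detour around a link marked as faulty. It is not the authors' routing scheme: the breadth-first search, the function names `torus_neighbors` and `route`, and the assumption that links are bidirectional are all hypothetical choices made for illustration only.

```python
from collections import deque


def torus_neighbors(node, n):
    """Return the four neighbors of (x, y) in an n x n torus, with wraparound."""
    x, y = node
    return [((x + 1) % n, y), ((x - 1) % n, y),
            (x, (y + 1) % n), (x, (y - 1) % n)]


def route(src, dst, n, faulty_links=frozenset()):
    """Breadth-first search for a shortest path that avoids faulty links.

    faulty_links holds frozenset({a, b}) pairs. Treating a link as
    bidirectional is an assumption of this sketch, not of the paper.
    Returns the list of nodes from src to dst, or None if unreachable.
    """
    parent = {src: None}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        if cur == dst:
            # Walk the parent chain back to the source.
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        for nxt in torus_neighbors(cur, n):
            if nxt in parent or frozenset((cur, nxt)) in faulty_links:
                continue
            parent[nxt] = cur
            queue.append(nxt)
    return None


# In a 3x3 torus, a fault on the link (0,0)-(1,0) can be bypassed
# through the wraparound link in the opposite direction.
fault = frozenset({frozenset({(0, 0), (1, 0)})})
healthy_path = route((0, 0), (1, 0), 3)
detour_path = route((0, 0), (1, 0), 3, fault)
```

In this 3x3 example the detour takes two hops instead of one, which shows why a torus can tolerate some link faults without added hardware links; the actual performance penalty reported in the abstract depends on the authors' routing and traffic, which this sketch does not model.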