Prototyping a fault-tolerant multiprocessor SoC with run-time fault recovery

  • Authors:
  • Xinping Zhu;Wei Qin

  • Affiliations:
  • Northeastern University, Boston, MA;Boston University, Boston, MA

  • Venue:
  • Proceedings of the 43rd annual Design Automation Conference
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Modern integrated circuits (ICs) are becoming increasingly complex. The complexity makes it difficult to design, manufacture and integrate these high performance ICs. The advent of multiprocessor Systems-on-chips (SoCs) makes it even more challenging for programmers to utilize the full potential of the computation resources on the chips. In the mean time, the complexity of the chip design creates new reliability challenges. As a result, chip designers and users cannot fully exploit the tremendous silicon resources on the chip. This research proposes a prototype which is composed of a fault tolerantmultiprocessor SoC and a coupled single program, multiple data (SPMD) programming framework. We use a SystemC based modeling and simulation environment to design and analyze this prototype. Our analysis shows that this prototype as a reliable computing platform constructed from the potentially unreliable chip resources, thus protecting the previous investment of hardware and software designs. Moreover, the promising application-driven simulation results shed light on the potential of a scalable and reliable multiprocessing computing platform for a wide range of mission-critical applications.