Cooperative recovery of distributed storage systems from multiple losses with network coding

  • Authors:
  • Yuchong Hu;Yinlong Xu;Xiaozhao Wang;Cheng Zhan;Pei Li

  • Affiliations:
  • School of Computer Science & Technology, University of Science & Technology of China and Key Laboratory on High Performance Computing, Anhui Province;School of Computer Science & Technology, University of Science & Technology of China and Key Laboratory on High Performance Computing, Anhui Province;School of Computer Science & Technology, University of Science & Technology of China and Key Laboratory on High Performance Computing, Anhui Province;School of Computer Science & Technology, University of Science & Technology of China and Key Laboratory on High Performance Computing, Anhui Province;School of Computer Science & Technology, University of Science & Technology of China and Key Laboratory on High Performance Computing, Anhui Province

  • Venue:
  • IEEE Journal on Selected Areas in Communications
  • Year:
  • 2010

Quantified Score

Hi-index 0.07

Abstract

This paper studies recovery from multiple node failures in distributed storage systems. We design a mutually cooperative recovery (MCR) mechanism for such failures. Via a cut-based analysis of the information flow graph, we obtain a lower bound on the maintenance bandwidth under MCR. For MCR, we also propose a transmission scheme and design a linear network coding scheme based on (n, k) strong-MDS codes, a generalization of (n, k) MDS codes. We prove that the maintenance bandwidth achieved by our transmission and coding schemes matches the lower bound, so the bound is tight and both schemes are optimal for MCR. Finally, we give numerical comparisons of MCR with other redundancy recovery mechanisms in terms of storage cost and maintenance bandwidth, which show the advantage of MCR.
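
For readers unfamiliar with the (n, k) MDS property that the abstract builds on, the following is a minimal sketch of it: a file of k symbols is encoded into n coded symbols, one per storage node, and any k of them suffice to rebuild the file. The sketch uses a Reed-Solomon style polynomial construction over a small prime field as a stand-in. It is not the paper's strong-MDS code or its MCR scheme, and the field size `P` and the names `encode`, `decode`, and `mds_holds` are assumptions made only for illustration.

```python
from itertools import combinations

P = 257  # prime field size; an illustrative assumption, not taken from the paper


def encode(data, n):
    """Treat `data` as polynomial coefficients and evaluate at x = 1..n (mod P).

    Node x stores the single coded symbol f(x). Requires n < P.
    """
    return [sum(c * pow(x, i, P) for i, c in enumerate(data)) % P
            for x in range(1, n + 1)]


def decode(shares, k):
    """Recover the k data symbols from any k (x, y) pairs.

    Solves the k-by-k Vandermonde system with Gauss-Jordan elimination mod P.
    """
    rows = [[pow(x, i, P) for i in range(k)] + [y] for x, y in shares]
    for col in range(k):
        # Distinct evaluation points make the system invertible, so a pivot exists.
        pivot = next(r for r in range(col, k) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], P - 2, P)  # modular inverse via Fermat's little theorem
        rows[col] = [v * inv % P for v in rows[col]]
        for r in range(k):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % P for a, b in zip(rows[r], rows[col])]
    return [rows[i][k] for i in range(k)]


def mds_holds(data, n):
    """Check that every k-subset of the n nodes reconstructs `data`."""
    k = len(data)
    nodes = list(enumerate(encode(data, n), start=1))  # (node index x, stored symbol)
    return all(decode(list(subset), k) == data
               for subset in combinations(nodes, k))


if __name__ == "__main__":
    original = [10, 20, 30]          # k = 3 data symbols, each smaller than P
    print(mds_holds(original, n=5))  # checks all 10 three-node subsets
```

In this toy setting a failed node would be repaired by decoding from k surviving nodes and re-encoding, which costs a full file download. The maintenance bandwidth analysed in the abstract concerns doing better than this baseline, and the sketch does not attempt that.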