DPCT: distributed parity cache table for redundant parallel file system

  • Authors:
  • Sheng-Kai Hung; Yarsun Hsu

  • Affiliations:
  • Department of Electrical Engineering, National Tsing-Hua University, HsinChu, Taiwan; Department of Electrical Engineering, National Tsing-Hua University, HsinChu, Taiwan

  • Venue:
  • HPCC'06 Proceedings of the Second International Conference on High Performance Computing and Communications
  • Year:
  • 2006

Abstract

Using parity information to protect data from loss in a parallel file system is a straightforward and cost-effective method. However, the “small-write” phenomenon can lead to poor write performance. This remains true in the distributed paradigm even when a file system cache is used: the local file system knows nothing about stripes and thus cannot benefit from the related blocks of a stripe. We propose a distributed parity cache table (DPCT), which knows the related blocks of a stripe and can use them to speed up parity calculation and parity updating. This high-level cache can benefit from previous reads and can aggregate small writes to improve overall performance. We implement this mechanism in our reliable parallel file system (RPFS). The experimental results show that DPCT support improves both read and write performance. The improvement comes from the reduction in the number of disk accesses that DPCT provides. This matches our quantitative analysis, which shows that the number of disk accesses can be reduced from N to N(1–H), where N is the number of I/O nodes and H is the DPCT hit ratio.
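The small-write problem and the N(1–H) estimate can be illustrated with a short sketch. It assumes byte-wise XOR parity, which the abstract does not specify, and every function name here is hypothetical rather than part of RPFS or DPCT. The sketch shows that parity can be updated from the old parity, the old block, and the new block alone, and it evaluates the expected number of disk accesses for a given hit ratio.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks (assumed parity scheme)."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks: list[bytes]) -> bytes:
    """Parity of a whole stripe: XOR of all of its data blocks."""
    return reduce(xor_blocks, blocks)


def small_write_parity(old_parity: bytes, old_block: bytes, new_block: bytes) -> bytes:
    """Update parity after one block changes, without reading the other blocks.

    new_parity = old_parity XOR old_block XOR new_block
    If old_parity and old_block are already cached, no disk read is needed.
    """
    return xor_blocks(xor_blocks(old_parity, old_block), new_block)


def expected_disk_accesses(n_io_nodes: int, hit_ratio: float) -> float:
    """Expected disk accesses N(1 - H) from the abstract's analysis."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must lie in [0, 1]")
    return n_io_nodes * (1.0 - hit_ratio)


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = full_parity(stripe)
    new_block = b"\xaa\xbb"
    updated = small_write_parity(parity, stripe[1], new_block)
    # The incremental update matches recomputing parity over the whole stripe.
    print(updated == full_parity([stripe[0], new_block, stripe[2]]))
    print(expected_disk_accesses(8, 0.75))
```

With a hit ratio of 0 the estimate stays at N, and with a hit ratio of 1 it falls to zero, which is consistent with the claim that the gain comes entirely from avoided disk accesses.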