Improved Read Performance in a Cost-Effective, Fault-Tolerant Parallel Virtual File System (CEFT-PVFS)

  • Authors:
  • Yifeng Zhu;Hong Jiang;Xiao Qin;Dan Feng;David R. Swanson

  • Venue:
  • CCGRID '03 Proceedings of the 3rd International Symposium on Cluster Computing and the Grid
  • Year:
  • 2003

Abstract

Due to the ever-widening performance gap between processors and disks, I/O operations tend to become the major performance bottleneck of data-intensive applications on modern clusters. If all the existing disks on the nodes of a cluster are connected together to establish high performance parallel storage systems, the cluster's overall performance can be boosted at no additional cost. CEFT-PVFS (a RAID 10 style parallel file system that extends the original PVFS), as one such system, divides the cluster nodes into two groups, stripes the data across one group in a round-robin fashion, and then duplicates the same data to the other group to provide storage service of high performance and high reliability. Previous research has shown that the system reliability is improved by a factor of more than 40 with mirroring while maintaining a comparable write performance. This paper presents another benefit of CEFT-PVFS in which the aggregate peak read performance can be improved by as much as 100% over that of the original PVFS by exploiting the increased parallelism.

Additionally, when the data servers, which typically are also computational nodes in a cluster environment, are loaded in an unbalanced way by applications running in the cluster, the read performance of PVFS will be degraded significantly. On the contrary, in the CEFT-PVFS, a heavily loaded data server can be skipped and all the desired data is read from its mirroring node. Thus the performance will not be affected unless both the server node and its mirroring node are heavily loaded.
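
The read path described in the abstract (round-robin striping across one group, a full copy on the other group, and skipping a heavily loaded server in favor of its mirror) can be illustrated with a minimal Python sketch. This is not the CEFT-PVFS implementation: the function name `plan_read`, the load values in the range 0 to 1, and the `HEAVY_LOAD` threshold are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; names and the threshold are assumptions,
# not taken from the CEFT-PVFS source code.
HEAVY_LOAD = 0.8  # hypothetical cutoff above which a server counts as heavily loaded


def plan_read(num_stripes, group_size, primary_load, mirror_load, heavy=HEAVY_LOAD):
    """Return a list of (stripe, group, node) choices for a read.

    Stripe i is placed round-robin on node i % group_size of the primary
    group, and the node with the same index in the mirror group holds a copy.
    A heavily loaded primary node is skipped when its mirror is not heavily
    loaded; if both are heavily loaded, the read stays on the primary.
    """
    plan = []
    for stripe in range(num_stripes):
        node = stripe % group_size  # round-robin placement
        if primary_load[node] >= heavy and mirror_load[node] < heavy:
            plan.append((stripe, "mirror", node))
        else:
            plan.append((stripe, "primary", node))
    return plan


if __name__ == "__main__":
    # Primary node 0 is busy, so its stripes are served by mirror node 0.
    for choice in plan_read(4, 2, primary_load=[0.9, 0.1], mirror_load=[0.2, 0.3]):
        print(choice)
```

In this sketch the read only suffers when a node and its mirror are both above the threshold, which matches the condition stated in the abstract; how the real system measures load and divides reads between the two groups is not covered here.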