A comparative experimental study of parallel file systems for large-scale data processing

  • Authors:
  • Zoe Sebepou; Kostas Magoutis; Manolis Marazakis; Angelos Bilas

  • Affiliations:
  • Institute of Computer Science, Foundation for Research and Technology - Hellas, Heraklion, Greece and Department of Computer Science, University of Crete, Greece; Institute of Computer Science, Foundation for Research and Technology - Hellas, Heraklion, Greece; Institute of Computer Science, Foundation for Research and Technology - Hellas, Heraklion, Greece; Institute of Computer Science, Foundation for Research and Technology - Hellas, Heraklion, Greece and Department of Computer Science, University of Crete, Greece

  • Venue:
  • LASCO'08 First USENIX Workshop on Large-Scale Computing
  • Year:
  • 2008

Abstract

Large-scale scientific and business applications must process ever-increasing amounts of data, fueling demand for scalable parallel file systems comprising hundreds to thousands of disks. Modern parallel file system architectures, however, span a large and complex design space. As a result, IT architects face a challenge when choosing the most appropriate parallel file system for a specific scientific or industrial application in a large-scale computing installation. Typically, the right choice depends on the characteristics of the application as well as on the design assumptions built into the parallel file system. In this study, we take a close look at two prominent modern parallel file systems, PVFS2 and Lustre, and compare them experimentally on a range of benchmark-driven scenarios modeling specific real-world applications.