Optimizing FastQuery performance on Lustre file system

  • Authors:
  • Kuan-Wu Lin;Surendra Byna;Jerry Chou;Kesheng Wu

  • Affiliations:
  • National Tsing Hua University, Hsinchu, Taiwan;Lawrence Berkeley National Laboratory, Berkeley, CA;National Tsing Hua University, Hsinchu, Taiwan;Lawrence Berkeley National Laboratory, Berkeley, CA

  • Venue:
  • Proceedings of the 25th International Conference on Scientific and Statistical Database Management
  • Year:
  • 2013

Abstract

FastQuery is a parallel indexing and querying system we developed to accelerate the analysis and visualization of scientific data. We have applied it to a wide variety of HPC applications and, in previous work, demonstrated its capability and scalability on a petascale trillion-particle simulation. In our experience, however, the performance of reading and writing data with FastQuery, as with many other HPC applications, can be significantly affected by tunable parameters throughout the parallel I/O stack. In this paper, we describe how we tuned the performance of FastQuery on a Lustre parallel file system. We study and analyze the impact of parameters and tunable settings at the file system, MPI-IO library, and HDF5 library levels of the I/O stack. We demonstrate that a combined optimization strategy significantly improves the performance and I/O bandwidth of FastQuery. In our tests with a trillion-particle dataset, the time to index the dataset was reduced by more than half.
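
As a rough illustration of the three tuning levels named in the abstract, the Python sketch below collects one representative setting per level: a Lustre striping command, a set of MPI-IO hints, and an HDF5 alignment pair. The helper names and every numeric value are hypothetical and are not taken from the paper. `lfs setstripe -c/-S`, the ROMIO hint keys, and HDF5's alignment property are existing interfaces, but whether these are the parameters the authors tuned is an assumption.

```python
import shlex

MIB = 1024 * 1024


def lustre_setstripe_cmd(path, stripe_count, stripe_size_mib):
    """File-system level: build an `lfs setstripe` command line.

    -c sets the number of OSTs to stripe over, -S the stripe size.
    (Older Lustre releases spelled the stripe-size flag differently.)
    """
    return "lfs setstripe -c {} -S {}M {}".format(
        stripe_count, stripe_size_mib, shlex.quote(path)
    )


def mpiio_hints(stripe_count, stripe_size_mib, aggregators):
    """MPI-IO level: ROMIO-style hints, as strings (MPI_Info values are strings)."""
    return {
        "striping_factor": str(stripe_count),          # number of stripes
        "striping_unit": str(stripe_size_mib * MIB),   # stripe size in bytes
        "cb_nodes": str(aggregators),                  # collective-buffering aggregators
        "romio_cb_write": "enable",                    # force collective buffering on writes
    }


def hdf5_alignment(stripe_size_mib, threshold_bytes=MIB):
    """HDF5 level: (threshold, alignment) pair in the order H5Pset_alignment expects.

    Objects at least `threshold_bytes` large are placed on stripe boundaries.
    """
    return (threshold_bytes, stripe_size_mib * MIB)


if __name__ == "__main__":
    # Illustrative values only: 64 stripes of 4 MiB, one aggregator per stripe.
    print(lustre_setstripe_cmd("/scratch/index.h5", 64, 4))
    print(mpiio_hints(64, 4, 64))
    print(hdf5_alignment(4))
```

In a real run, the hint dictionary would be copied into an MPI info object (for example with mpi4py's `MPI.Info.Create()` and `Set`) and the alignment pair applied to the HDF5 file-access property list before the file is created. The sketch only keeps the three levels consistent with one another, here by matching the HDF5 alignment to the Lustre stripe size.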