Hiding periodic I/O costs in parallel applications

  • Authors:
  • Marianne Winslett;Xiaosong Ma

  • Affiliations:
  • -;-

  • Venue:
  • Hiding periodic I/O costs in parallel applications
  • Year:
  • 2003

Abstract

Long-running simulation applications routinely need to save intermediate results for future analysis, as well as other data needed to restart after crashes or code modifications. Visualization applications, in turn, need to repeatedly read in and process the intermediate results generated by the simulations. Such periodic data transfer between main memory and secondary storage is a growing problem for today's large-scale scientific applications: the gap between the performance and scalability of processors and those of secondary storage systems has widened over the past decades, and this trend is expected to continue. Further, the complexity of applications and their data sets has been growing, making it increasingly difficult for application developers to use existing parallel I/O interfaces. This thesis addresses these problems with an approach that differs from most previous research on parallel I/O. Instead of improving the actual data transfer rate, we strive to hide the high I/O costs from the application's point of view by maximizing the overlap between I/O and other tasks. Our approaches take advantage of the periodicity of the I/O operations, the specific I/O semantic requirements of scientific codes, and the existence of idle system resources. To improve the apparent performance of periodic I/O operations, we present several novel techniques for application-level buffering and prefetching. To serve the I/O needs of large-scale, complex applications, we show how to incorporate these I/O performance optimizations into existing parallel I/O libraries for simulations and into a new general-purpose data management facility that we created for visualization applications. We evaluated our proposed I/O optimizations, namely active buffering and the GODIVA framework, with both synthetic benchmarks and real-world applications, including simulations and visualization tools.
The performance study shows that our proposed techniques can significantly reduce the application-visible periodic I/O cost. Further, our experience deploying these techniques in state-of-the-art simulation and visualization codes demonstrates that, with careful design, our techniques for hiding I/O costs can be combined naturally with mechanisms for adaptive performance optimization, application self-configuration, and effective data management.
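
The general idea of overlapping periodic output with computation can be illustrated with a minimal Python sketch. This is not the thesis's active buffering implementation or the GODIVA interface; the class name `BackgroundWriter`, its methods, and the `sink` callable are all hypothetical, chosen only to show application-level write-behind buffering: the caller hands a snapshot to an in-memory buffer and resumes computing while a background thread performs the slow write.

```python
import queue
import threading


class BackgroundWriter:
    """Hypothetical write-behind buffer (illustration only, not the thesis's design)."""

    def __init__(self, sink, max_buffered=4):
        # sink: assumed callable sink(step, data) that performs the slow write.
        self._sink = sink
        # Bounded queue: memory use stays capped at max_buffered snapshots.
        self._queue = queue.Queue(maxsize=max_buffered)
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def write(self, step, data):
        # Copy the data so the caller may overwrite its buffer immediately.
        # This call blocks only when the buffer is already full.
        self._queue.put((step, bytes(data)))

    def _drain(self):
        # Background thread: pull snapshots in FIFO order and write them out.
        while True:
            item = self._queue.get()
            if item is None:  # sentinel enqueued by close()
                break
            self._sink(*item)

    def close(self):
        # Flush everything still buffered, then stop the background thread.
        self._queue.put(None)
        self._thread.join()


def run_demo():
    written = []
    writer = BackgroundWriter(lambda step, data: written.append((step, data)))
    state = bytearray(4)
    for step in range(3):
        state[0] = step            # stand-in for one simulation time step
        writer.write(step, state)  # returns without waiting for the sink
    writer.close()
    return written
```

In this sketch the application-visible cost of each periodic write is roughly the cost of a memory copy, as long as the background thread drains the buffer faster than new snapshots arrive. When it does not, `write` blocks, which is the point at which the I/O cost becomes visible again.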