Energy-aware I/O optimization for checkpoint and restart on a NAND flash memory system

  • Authors:
  • Takafumi Saito;Kento Sato;Hitoshi Sato;Satoshi Matsuoka

  • Affiliations:
  • Tokyo Institute of Technology, Tokyo, Japan (all authors)

  • Venue:
  • Proceedings of the 3rd Workshop on Fault-tolerance for HPC at extreme scale
  • Year:
  • 2013

Abstract

Energy efficiency and system reliability are both major concerns for exascale high-performance computing. In such large HPC systems, applications must perform massive I/O operations to local storage devices (e.g., NAND flash memory) for scalable checkpoint and restart. However, checkpoint/restart can account for a large portion of runtime, and it consumes enormous energy in non-I/O subsystems such as the CPU and memory. Thus, energy-aware optimization that includes I/O operations to storage is required for checkpoint/restart. In this paper, we present a profile-based I/O optimization technique for NAND flash memory devices, based on a Markov model for checkpoint/restart. Performance studies show that our profile lookup approach can save 4.1% of the energy consumed by an application execution with checkpoint/restart. In particular, our approach improves the energy consumption of write operations by 67.4% and of read operations by 40.2% on a PCIe-attached NAND flash memory device.