Analysis of I/O Performance on an Amazon EC2 Cluster Compute and High I/O Platform

  • Authors:
  • Roberto R. Expósito; Guillermo L. Taboada; Sabela Ramos; Jorge González-Domínguez; Juan Touriño; Ramón Doallo

  • Affiliations:
  • Department of Electronics and Systems, University of A Coruña, A Coruña, Spain (all authors)

  • Venue:
  • Journal of Grid Computing
  • Year:
  • 2013

Abstract

Cloud computing is currently being explored by the scientific community to assess its suitability for High Performance Computing (HPC) environments. In this novel paradigm, compute and storage resources, as well as applications, can be dynamically provisioned on a pay-per-use basis. This paper presents a thorough evaluation of the I/O storage subsystem using the Amazon EC2 Cluster Compute platform and the recent High I/O instance type, to determine its suitability for I/O-intensive applications. The evaluation has been carried out at different layers using representative benchmarks, covering the low-level cloud storage devices available in Amazon EC2, namely ephemeral disks and Elastic Block Store (EBS) volumes, on both local and distributed file systems. In addition, several I/O interfaces commonly used by scientific workloads (POSIX, MPI-IO and HDF5) have been assessed. Furthermore, the scalability of a representative parallel I/O code has been analyzed at the application level, taking into account both performance and cost metrics. The analysis of the experimental results shows that the available cloud storage devices can have different performance characteristics and usage constraints. Our comprehensive evaluation can help scientists significantly increase (by up to several times) the performance of I/O-intensive applications in the Amazon EC2 cloud. An example of an optimal configuration that can maximize I/O performance in this cloud is the use of a RAID 0 of 2 ephemeral disks, TCP with a 9,000-byte MTU, NFS async and MPI-IO on the High I/O instance type, which provides ephemeral disks backed by Solid State Drive (SSD) technology.
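
The storage and network parts of the example configuration named in the abstract could be set up roughly as in the sketch below. This is not the authors' procedure. It is a minimal Python helper that only builds the command strings and the NFS export line, without executing anything. The device names (`/dev/xvdb`, `/dev/xvdc`), the array name `/dev/md0`, the mount point `/export`, the interface `eth0`, the ext4 file system and the wildcard client specification are all assumptions made for illustration, and the MPI-IO layer is not covered.

```python
import shlex


def build_setup_commands(disks, md_device="/dev/md0", mount_point="/export",
                         nic="eth0", mtu=9000):
    """Return shell command strings (not executed) for a RAID 0 + jumbo-frame setup.

    All default names are hypothetical placeholders, not values from the paper.
    """
    if len(disks) != 2:
        raise ValueError("the example configuration uses a RAID 0 of exactly 2 disks")
    commands = [
        # Stripe the two ephemeral disks into one RAID 0 array.
        ["mdadm", "--create", md_device, "--level=0",
         f"--raid-devices={len(disks)}", *disks],
        # File system choice is an assumption; the abstract does not name one.
        ["mkfs.ext4", md_device],
        ["mkdir", "-p", mount_point],
        ["mount", md_device, mount_point],
        # Raise the MTU to 9,000 bytes (jumbo frames) on the chosen interface.
        ["ip", "link", "set", "dev", nic, "mtu", str(mtu)],
    ]
    return [shlex.join(command) for command in commands]


def build_exports_line(mount_point="/export", clients="*"):
    """Return an /etc/exports entry that enables the NFS 'async' option."""
    return f"{mount_point} {clients}(rw,async)"


if __name__ == "__main__":
    for line in build_setup_commands(["/dev/xvdb", "/dev/xvdc"]):
        print(line)
    print(build_exports_line())
```

Generating the commands as text keeps the sketch safe to run on any machine. Applying them on a real instance would require root privileges and the actual device and interface names.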