Cloud computing is currently being explored by the scientific community to assess its suitability for High Performance Computing (HPC) environments. In this paradigm, compute and storage resources, as well as applications, can be dynamically provisioned on a pay-per-use basis. This paper presents a thorough evaluation of the I/O storage subsystem on the Amazon EC2 Cluster Compute platform and the recent High I/O instance type, to determine its suitability for I/O-intensive applications. The evaluation has been carried out at different layers using representative benchmarks. At the lowest layer, it covers the cloud storage devices available in Amazon EC2, namely ephemeral disks and Elastic Block Store (EBS) volumes, on both local and distributed file systems. In addition, several I/O interfaces commonly used by scientific workloads (POSIX, MPI-IO and HDF5) have been assessed. Finally, the scalability of a representative parallel I/O code has been analyzed at the application level, taking into account both performance and cost metrics. The experimental results show that the available cloud storage devices differ in their performance characteristics and usage constraints. Our evaluation can help scientists to increase significantly (up to several times) the performance of I/O-intensive applications in the Amazon EC2 cloud. One example of a configuration that can maximize I/O performance in this cloud is a RAID 0 array of 2 ephemeral disks, TCP with a 9,000-byte MTU, NFS in async mode, and MPI-IO, all on the High I/O instance type, which provides ephemeral disks backed by Solid State Drive (SSD) technology.
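To make the idea of weighing performance against cost concrete, the following is a minimal sketch, not the benchmark suite used in the paper. It times a sequential write through the POSIX-level interface and divides the resulting bandwidth by an hourly instance price. The function names, the block size, and the prices in the usage section are all hypothetical and chosen only for illustration.

```python
import os
import tempfile
import time


def measure_write_bandwidth(path, total_bytes, block_bytes=1 << 20):
    """Sequentially write total_bytes to path and return bandwidth in MB/s.

    Illustrative only: a single-process POSIX write loop, not a parallel
    I/O benchmark. The 1 MiB default block size is an assumption.
    """
    block = b"\0" * block_bytes
    written = 0
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        while written < total_bytes:
            chunk = min(block_bytes, total_bytes - written)
            # os.write may write fewer bytes than requested, so add
            # what was actually written.
            written += os.write(fd, block[:chunk])
        # Flush to the device so the page cache does not hide the cost.
        os.fsync(fd)
    finally:
        os.close(fd)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (written / 1e6) / elapsed


def bandwidth_per_dollar(mb_per_s, hourly_price_usd):
    """Return MB/s obtained per USD of hourly instance price."""
    if hourly_price_usd <= 0:
        raise ValueError("hourly_price_usd must be positive")
    return mb_per_s / hourly_price_usd


if __name__ == "__main__":
    # Hypothetical hourly prices, for illustration only.
    hypothetical_prices = {"instance_a": 1.30, "instance_b": 3.10}
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "probe.bin")
        bw = measure_write_bandwidth(target, 64 * (1 << 20))
    for name, price in hypothetical_prices.items():
        print(f"{name}: {bandwidth_per_dollar(bw, price):.1f} MB/s per USD/hour")
```

Pointing `path` at a file system mounted on the storage device of interest (an ephemeral disk, an EBS volume, or an NFS export) would give a rough per-device figure. A serious comparison would need larger files, repeated runs, and a parallel benchmark.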