Scalable repositories for virtual clusters

Authors:
Paolo Anedda; Simone Leo; Massimo Gaggero; Gianluigi Zanetti
Affiliations:
CRS4 Distributed Computing Group, Pula, Italy (all authors)
Venue:
Euro-Par'09 Proceedings of the 2009 international conference on Parallel processing
Year:
2009

Abstract

For a large class of scientific data analysis applications, the sheer size of the datasets makes it increasingly important to be able to run the analysis directly where the data are stored, rather than on remote computational clusters. One possible strategy is to use virtual clusters, which guarantee a high degree of isolation from the underlying physical infrastructure and have a very compact initial description. Deploying, saving, and restoring HPC-dedicated virtual clusters, however, places a different class of requirements on the virtual machine management infrastructure, particularly on storage I/O, whose scalability limits are easily reached. Here we discuss an alternative approach based on a storage model that exploits the WORM (write once, read many) character of the data used by VM management to scalably increase the aggregate data bandwidth available to virtual-cluster-level operations, and we provide preliminary results indicating that it is a viable solution.
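
The abstract does not describe how the storage model is implemented, so the following is only an illustrative sketch of the general idea, not the authors' system. It assumes that write-once data can be copied to several storage nodes without any coherence protocol, since it never changes after being written, and that reads can then be spread across those copies. The class `WriteOnceStore`, the node names, and the round-robin read policy are all hypothetical.

```python
import itertools


class WriteOnceStore:
    """Toy write-once, read-many store replicated over several nodes.

    Hypothetical illustration only; not the storage model from the paper.
    """

    def __init__(self, node_names):
        # Each node holds a full copy of every image: name -> bytes.
        self.nodes = {name: {} for name in node_names}
        # Round-robin iterator used to spread reads over the replicas.
        self._next_node = itertools.cycle(node_names)
        # Per-node read counters, kept only to show how load is distributed.
        self.reads = {name: 0 for name in node_names}

    def put(self, image_name, data):
        """Store an image on every node; overwriting is refused (write once)."""
        if any(image_name in images for images in self.nodes.values()):
            raise ValueError(f"{image_name!r} already exists and is immutable")
        for images in self.nodes.values():
            images[image_name] = bytes(data)

    def get(self, image_name):
        """Read an image from the next replica in round-robin order."""
        node = next(self._next_node)
        self.reads[node] += 1
        return self.nodes[node][image_name]


store = WriteOnceStore(["node-a", "node-b", "node-c"])
store.put("base-image", b"root filesystem contents")

# Six virtual machines read the same image; the reads spread over the replicas.
for _ in range(6):
    assert store.get("base-image") == b"root filesystem contents"
```

Because the stored data is immutable, every replica can serve any read, so in this simplified picture the aggregate read bandwidth grows with the number of nodes holding a copy.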