GPFS: A Shared-Disk File System for Large Computing Clusters
FAST '02 Proceedings of the Conference on File and Storage Technologies
Sun Grid Engine: Towards Creating a Compute Power Grid
CCGRID '01 Proceedings of the 1st International Symposium on Cluster Computing and the Grid
Xen and the art of virtualization
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Maestro-VC: A Paravirtualized Execution Environment for Secure On-Demand Cluster Computing
CCGRID '06 Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid
Virtual Clusters for Grid Communities
CCGRID '06 Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Virtual Clusters on the Fly - Fast, Scalable, and Flexible Installation
CCGRID '07 Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid
Drowning in data: digital library architecture to support scientific use of embedded sensor networks
Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
PVFS: a parallel file system for linux clusters
ALS'00 Proceedings of the 4th annual Linux Showcase & Conference - Volume 4
The definitive guide to the xen hypervisor
The definitive guide to the xen hypervisor
Bioinformatics
Cloud analytics: do we really need to reinvent the storage stack?
HotCloud'09 Proceedings of the 2009 conference on Hot topics in cloud computing
Virtual workspaces in the grid
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Euro-Par'11 Proceedings of the 2011 international conference on Parallel Processing - Volume 2
Hi-index | 0.00 |
For a large class of scientific data analysis applications it is becoming important, due to the sheer size of datasets, to have the option to perform the analysis directly where the data are stored, rather than on remote computational clusters. A possible strategy is the use of virtual clusters, thus guaranteeing a high degree of isolation from the underlying physical computational structure, and a very compact initial description. Deploying, saving and restoring HPC dedicated virtual clusters introduces, however, a different class of requirements on the virtual machines managing infrastructure, in particular for what concerns storage I/O requirements, whose scalability boundaries are easily reached. Here we discuss an alternative approach based on a storage model that leverages the WORM (write once, read many) character of the data used by VM management to increase, in a scalable way, the aggregate data bandwidth available to virtual cluster level operations and provide preliminary results indicating that it is a viable solution.