Shared temporary storage space is often the constraining resource for clusters that serve as execution nodes in wide-area distributed systems. At least one large national-scale computing grid has reported a failure rate as high as thirty percent of submitted jobs, often due to accidentally filled shared storage spaces. Previous systems have attacked this problem by adding space allocation to the distributed system interface. However, these allocations are not enforced at the filesystem level, so unexpected or unaccounted uses of storage may still cause the system to fail. By adding an inexpensive allocation mechanism to the operating system, we may improve the robustness of such systems at minimal cost. In this paper, we describe an abstract model of space allocation in the file system and explore three implementations of the model: a user-level library, a recursive loopback filesystem, and a modified kernel filesystem. We evaluate the performance and completeness of these implementations and demonstrate that kernel support is essential to keeping the overhead low. Finally, we demonstrate empirically that a cluster under heavy filesystem load can be made more robust by adding allocations to the filesystem.