The computing facilities used to process data for the experiments at the Large Hadron Collider at CERN are scattered around the world. The embarrassingly parallel workload allows the use of various computing resources, such as Grid sites of the Worldwide LHC Computing Grid, commercial and institutional cloud resources, and individual home PCs in "volunteer clouds". Unlike the data, the experiment software cannot easily be split into small work units. Efficient delivery of the complex and frequently changing experiment software is therefore crucial to harnessing heterogeneous resources. Here we present an approach to deliver software on demand using a scalable hierarchy of standard HTTP caches. We show how to tackle this problem by pre-processing software into content-addressable storage. On the worker nodes, we use a specially crafted file system that ensures data integrity and provides fault tolerance. We show performance figures from a large-scale deployment. For the most common case of computing clusters with 10 to 1000 worker nodes, we present a novel state dissemination protocol to support a fully decentralized and distributed memory cache.
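To make the idea of content-addressable storage concrete, here is a minimal sketch of how a software tree might be pre-processed so that each file is stored under a name derived from its content and verified on retrieval. It is an illustration only, not the system described above: the function names (`store_blob`, `build_catalog`, `fetch_verified`), the two-level directory layout, and the choice of SHA-256 are all assumptions made for this example.

```python
import hashlib
import tempfile
from pathlib import Path


def store_blob(data: bytes, root: Path) -> str:
    """Store data under a name derived from its SHA-256 digest; return the digest."""
    digest = hashlib.sha256(data).hexdigest()  # hash choice is an assumption
    target = root / digest[:2] / digest[2:]    # hypothetical two-level layout
    if not target.exists():                    # identical content is stored once
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    return digest


def build_catalog(files: dict[str, bytes], root: Path) -> dict[str, str]:
    """Map each logical path to the digest of its content."""
    return {path: store_blob(data, root) for path, data in files.items()}


def fetch_verified(digest: str, root: Path) -> bytes:
    """Read a blob and check that its content still matches its name."""
    data = (root / digest[:2] / digest[2:]).read_bytes()
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError(f"integrity check failed for {digest}")
    return data


if __name__ == "__main__":
    store = Path(tempfile.mkdtemp())
    catalog = build_catalog({"lib/a.so": b"same", "lib/b.so": b"same"}, store)
    # Both paths map to one stored object because their content is identical.
    print(catalog["lib/a.so"] == catalog["lib/b.so"])
```

Because an object's name is fixed by its content, a stored object never changes under a given name. That property is what makes such objects straightforward to serve through ordinary HTTP caches and to verify on the client side.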