Desktop Grids use the computing, network and storage resources of idle desktop PCs distributed over multiple LANs or the Internet to run a large variety of resource-demanding distributed applications. While these applications need to access, compute, store and circulate large volumes of data, little attention has been paid to data management in such large-scale, dynamic, heterogeneous, volatile and highly distributed Grids. In most cases, data management relies on ad hoc solutions, and providing a general approach is still a challenging issue. A new class of data management service is desirable to deal with a variety of file transfer protocols, such as client/server, P2P, or the new and emerging Cloud storage services. To address this problem, we propose the BitDew framework, a programmable environment for automatic and transparent data management on computational Desktop Grids. This paper describes the BitDew programming interface, its architecture, and the performance evaluation of its runtime components. BitDew relies on a specific set of metadata to drive key data management operations, namely life cycle, distribution, placement, replication and fault tolerance, with a high level of abstraction. The BitDew runtime environment is a flexible distributed service architecture that integrates modular P2P components such as DHTs (Distributed Hash Tables) for a Distributed Data Catalog and collaborative transport protocols for data distribution. We explain how to plug in new or existing protocols, and we give evidence of the versatility of the framework by implementing the HTTP, FTP and BitTorrent protocols as well as access to the Amazon S3 and IBP Wide Area Storage. We describe the mechanisms used to provide asynchronous and reliable multi-protocol transfers. Through several examples, we describe how application programmers and BitDew users can exploit BitDew's features.
We report on performance evaluation using micro-benchmarks, various usage scenarios and a data-intensive bioinformatics application, both in the Grid context and on the Internet. The performance evaluation demonstrates that the high level of abstraction and transparency is obtained with a reasonable overhead, while offering the benefits of scalability, performance and fault tolerance at little programming cost.
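To make the idea of metadata-driven data management more concrete, the following is a minimal sketch of how per-datum attributes could steer replication, fault tolerance and the choice of transfer protocol. It is not BitDew's actual programming interface: the names `Attribute`, `Scheduler`, `register_protocol`, `put` and `host_failed` are hypothetical and chosen only for illustration, and the "transfers" are simulated by strings rather than real network operations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Attribute:
    """Hypothetical metadata attached to a datum (not BitDew's real API)."""
    replica: int = 1              # desired number of copies
    fault_tolerant: bool = False  # re-place copies lost when a host leaves
    protocol: str = "http"        # name of the transfer back end to use


# Registry of pluggable transfer back ends, keyed by protocol name.
TRANSFERS: Dict[str, Callable[[str, str], str]] = {}


def register_protocol(name: str):
    """Decorator that plugs a transfer function into the registry."""
    def wrap(fn: Callable[[str, str], str]):
        TRANSFERS[name] = fn
        return fn
    return wrap


@register_protocol("http")
def http_transfer(data_id: str, host: str) -> str:
    # Simulated transfer: a real back end would move bytes here.
    return f"http:{data_id}->{host}"


@register_protocol("bittorrent")
def bittorrent_transfer(data_id: str, host: str) -> str:
    return f"bittorrent:{data_id}->{host}"


class Scheduler:
    """Places data on hosts according to each datum's attributes."""

    def __init__(self, hosts: List[str]):
        self.hosts: List[str] = list(hosts)    # hosts currently alive
        self.attrs: Dict[str, Attribute] = {}
        self.owners: Dict[str, Set[str]] = {}  # data_id -> hosts holding a copy
        self.log: List[str] = []               # record of simulated transfers

    def put(self, data_id: str, attr: Attribute) -> None:
        if attr.protocol not in TRANSFERS:
            raise ValueError(f"unknown protocol: {attr.protocol}")
        self.attrs[data_id] = attr
        self.owners[data_id] = set()
        self._place(data_id)

    def _place(self, data_id: str) -> None:
        """Create copies until the requested replica count is reached."""
        attr = self.attrs[data_id]
        owners = self.owners[data_id]
        for host in self.hosts:
            if len(owners) >= attr.replica:
                break
            if host not in owners:
                owners.add(host)
                self.log.append(TRANSFERS[attr.protocol](data_id, host))

    def host_failed(self, host: str) -> None:
        """Drop a host; restore replicas only for fault-tolerant data."""
        self.hosts.remove(host)
        for data_id, owners in self.owners.items():
            if host in owners:
                owners.discard(host)
                if self.attrs[data_id].fault_tolerant:
                    self._place(data_id)
```

In this sketch the application only declares what it wants, for instance `Scheduler(["a", "b", "c"]).put("genome.db", Attribute(replica=2, fault_tolerant=True, protocol="bittorrent"))`, and the scheduler decides where copies go and which back end moves them. Adding a protocol amounts to registering one more function, which mirrors, in a very reduced form, the plug-in approach the abstract describes.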