PonD: dynamic creation of HTC pool on demand using a decentralized resource discovery system

Authors:
Kyungyong Lee; David Wolinsky; Renato J. Figueiredo
Affiliations:
University of Florida, Gainesville, FL, USA; Yale University, New Haven, CT, USA; University of Florida, Gainesville, FL, USA
Venue:
Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing
Year:
2012

Abstract

High Throughput Computing (HTC) platforms aggregate heterogeneous resources to provide vast amounts of computing power over a long period of time. Typical HTC systems, such as Condor and BOINC, rely on central managers for resource discovery and scheduling. While this approach simplifies deployment, it requires careful system configuration and management to ensure high availability and scalability. In this paper, we present a novel approach that integrates a self-organizing P2P overlay for scalable and timely discovery of resources with unmodified client/server job scheduling middleware in order to create HTC virtual resource Pools on Demand (PonD). This approach decouples resource discovery and scheduling from job execution/monitoring: a job submission dynamically generates an HTC platform based upon resources discovered through match-making from a large "sea" of resources in the P2P overlay and forms a "PonD" capable of leveraging unmodified HTC middleware for job execution and monitoring. We show, through first-order analytical models and large-scale simulation results, that the job scheduling time of our approach scales as O(log N), where N is the number of resources in a pool. To verify the practicality of PonD, we have implemented a prototype using Condor (called C-PonD), a structured P2P overlay, and a PonD creation module. Experimental results with the prototype in two WAN environments (PlanetLab and the FutureGrid cloud computing testbed) demonstrate the utility of C-PonD as an HTC approach that does not rely on a central repository for maintaining all resource information. Though the prototype is based on Condor, the decoupled nature of the system components (decentralized resource discovery, PonD creation, and job execution/monitoring) is generally applicable to other grid computing middleware systems.
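
The abstract does not describe the overlay or the match-making algorithm in detail, so the following is only a toy sketch of why a tree-structured query can reach N resources in O(log N) hops. It assumes a hypothetical implicit binary tree laid out over a list of resources (not the structured P2P overlay used by C-PonD), a hypothetical `requirement` predicate standing in for match-making, and a `wanted` limit on how many matches are aggregated back to the submitter. The names `tree_depth` and `discover` are illustrative and do not come from the paper.

```python
def tree_depth(n_nodes, fanout=2):
    """Levels below the root needed for a full fanout-ary tree to hold n_nodes."""
    depth, reached, frontier = 0, 1, 1
    while reached < n_nodes:
        frontier *= fanout      # nodes on the next level
        reached += frontier     # nodes reachable so far
        depth += 1
    return depth


def discover(resources, requirement, wanted, fanout=2):
    """Send a query down an implicit fanout-ary tree over `resources`.

    Each node tests the (assumed) requirement predicate locally, then
    merges the matches reported by its children, keeping at most `wanted`.
    Returns (matching indices, deepest level reached). The deepest level
    stands in for the number of overlay hops on the longest path.
    """
    def visit(index, level):
        if index >= len(resources):
            return [], level - 1          # no such node; report parent's level
        matches = [index] if requirement(resources[index]) else []
        deepest = level
        first_child = fanout * index + 1
        for child in range(first_child, first_child + fanout):
            child_matches, child_depth = visit(child, level + 1)
            matches.extend(child_matches)
            deepest = max(deepest, child_depth)
        return matches[:wanted], deepest  # aggregate on the way back up

    return visit(0, 0)


if __name__ == "__main__":
    # Hypothetical pool: 1000 machines with a made-up memory attribute.
    pool = [{"mem_gb": i % 8} for i in range(1000)]
    found, hops = discover(pool, lambda r: r["mem_gb"] >= 4, wanted=10)
    print(len(found), "matches within", hops, "levels")
```

The traversal here is sequential for simplicity. In a real overlay the branches would be explored in parallel, which is why the depth of the tree, rather than the number of nodes visited, is the quantity that relates to the logarithmic scaling claimed above.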