We have extended the Falkon lightweight task execution framework to make loosely coupled programming a practical and useful model on petascale systems. We study and measure the performance factors involved in applying this approach, with the goal of making petascale systems easier to use and accessible to a broader user community. Our work enables the execution of highly parallel computations composed of loosely coupled serial jobs, with no modifications to the applications themselves. This approach allows a new, and potentially far larger, class of applications to leverage petascale systems such as the IBM Blue Gene/P supercomputer. We present the I/O performance challenges encountered in making this model practical, and show results using both microbenchmarks and real applications from two domains: economic energy modeling and molecular dynamics. Our benchmarks show that we can scale up to 160K processor cores with high efficiency and can achieve sustained execution rates of thousands of tasks per second.