The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7
GPFS: A Shared-Disk File System for Large Computing Clusters
FAST '02 Proceedings of the Conference on File and Storage Technologies
Pegasus: A framework for mapping complex scientific workflows onto distributed systems
Scientific Programming
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
PVFS: a parallel file system for linux clusters
ALS'00 Proceedings of the 4th annual Linux Showcase & Conference - Volume 4
ESCIENCE '08 Proceedings of the 2008 Fourth IEEE International Conference on eScience
MapReduce for Data Intensive Scientific Analyses
ESCIENCE '08 Proceedings of the 2008 Fourth IEEE International Conference on eScience
Montage: a grid portal and software toolkit for science-grade astronomical image mosaicking
International Journal of Computational Science and Engineering
Composing and executing parallel data-flow graphs with shell pipes
Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science
The case for a versatile storage system
ACM SIGOPS Operating Systems Review
Twister: a runtime for iterative MapReduce
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
HaLoop: efficient iterative data processing on large clusters
Proceedings of the VLDB Endowment
MapReduce in the Clouds for Science
CLOUDCOM '10 Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science
Scale and concurrency of GIGA+: file system directories with millions of files
FAST'11 Proceedings of the 9th USENIX conference on File and stroage technologies
DAGuE: A generic distributed DAG engine for High Performance Computing
Parallel Computing
Swift: A language for distributed parallel scripting
Parallel Computing
A Workflow-Aware Storage System: An Opportunity Study
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Design and analysis of data management in scalable parallel scripting
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Proceedings of the 22nd international symposium on High-performance parallel and distributed computing
MARISSA: MApReduce Implementation for Streaming Science Applications
E-SCIENCE '12 Proceedings of the 2012 IEEE 8th International Conference on E-Science (e-Science)
ZHT: A Light-Weight Reliable Persistent Dynamic Scalable Zero-Hop Distributed Hash Table
IPDPS '13 Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing
Hi-index | 0.00 |
Scripting is often used in science to create applications via the composition of existing programs. Parallel scripting systems allow the creation of such applications, but each system introduces the need to adopt a somewhat specialized programming model. We present an alternative scripting approach, AMFS Shell, that lets programmers express parallel scripting applications via minor extensions to existing sequential scripting languages, such as Bash, and then execute them in-memory on large-scale computers. We define a small set of commands between the scripts and a parallel scripting runtime system, so that programmers can compose their scripts in a familiar scripting language. The underlying AMFS implements both collective (fast file movement) and functional (transformation based on content) file management. Tasks are handled by AMFS's built-in execution engine. AMFS Shell is expressive enough for a wide range of applications, and the framework can run such applications efficiently on large-scale computers.