Data analysis workflows are often composed of many concurrent, compute-intensive tasks that can be executed efficiently only on scalable computing infrastructures such as HPC systems, Grids and Cloud platforms. The use of Cloud services for the scalable execution of data analysis workflows is the key feature of the Data Mining Cloud Framework (DMCF), which provides a Web interface for developing data analysis applications using a visual workflow formalism. In this paper we describe how we extended DMCF to also support the design and execution of script-based data analysis workflows on Clouds. We introduce a workflow language, named JS4Cloud, that extends JavaScript to support the implementation of Cloud-based data analysis tasks and the handling of data on the Cloud. We also describe how data analysis workflows programmed in JS4Cloud are processed by DMCF to make parallelism explicit and to enable their scalable execution on Clouds. Finally, we present a data analysis application developed with JS4Cloud, and the performance results obtained by executing the application with DMCF on the Windows Azure platform.
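The idea of making parallelism explicit can be illustrated with a small sketch. It is not DMCF's actual algorithm and it is not JS4Cloud code, since the abstract shows neither; it is written in Python, and every name in it (`Task`, `parallel_stages`, the task and data names) is hypothetical. It assumes each task in a script declares the data items it reads and writes, and it groups tasks into stages so that tasks in the same stage have no data dependencies on one another and could run concurrently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A hypothetical workflow task described only by the data it reads and writes."""
    name: str
    inputs: frozenset
    outputs: frozenset


def parallel_stages(tasks):
    """Group tasks into stages; tasks in one stage do not depend on each other."""
    # Data items produced by some task; anything else is assumed to exist already.
    produced_by_any = set().union(*(t.outputs for t in tasks))
    available = set()
    pending = list(tasks)
    stages = []
    while pending:
        # A task is ready once every input is either external or already produced.
        ready = [
            t for t in pending
            if all(d in available or d not in produced_by_any for d in t.inputs)
        ]
        if not ready:
            raise ValueError("cyclic or unsatisfiable data dependency")
        stages.append([t.name for t in ready])
        for t in ready:
            available |= t.outputs
        pending = [t for t in pending if t not in ready]
    return stages


# Illustrative workflow: split a dataset, train two models, pick the best one.
tasks = [
    Task("partition", frozenset({"dataset"}), frozenset({"part0", "part1"})),
    Task("train0", frozenset({"part0"}), frozenset({"model0"})),
    Task("train1", frozenset({"part1"}), frozenset({"model1"})),
    Task("select_best", frozenset({"model0", "model1"}), frozenset({"best_model"})),
]
stages = parallel_stages(tasks)
print(stages)
```

In this example the two training tasks land in the same stage because neither reads what the other writes, which is the kind of concurrency a sequentially written script can expose once its data dependencies are known.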