Services + Components = Data Intensive Scientific Workflow Applications with MeDICi

  • Authors:
  • Ian Gorton; Jared Chase; Adam Wynne; Justin Almquist; Alan Chappell

  • Affiliations:
  • Pacific Northwest National Lab, Richland, WA 99352, USA (all authors)

  • Venue:
  • CBSE '09 Proceedings of the 12th International Symposium on Component-Based Software Engineering
  • Year:
  • 2009

Abstract

Scientific applications are often structured as workflows that execute a series of distributed software modules to analyze large data sets. Such workflows are typically built with general-purpose scripting languages, which coordinate the execution of the modules and exchange data sets between them. Scripts are a cost-effective approach for simple workflows, but as a workflow's structure grows and evolves, they quickly become intricate and difficult to modify. This makes them a major barrier to quickly deploying new algorithms and exploiting new, scalable hardware platforms. In this paper, we describe the MeDICi Workflow technology, which is specifically designed to reduce the complexity of workflow application development and to handle data-intensive workflow applications efficiently. MeDICi integrates standard component-based and service-based technologies, and employs an integration mechanism that ensures large data sets can be processed efficiently. We illustrate the use of MeDICi with a climate data processing example that we have built, and describe some of the new features we are creating to further enhance MeDICi Workflow applications.