XML database support for distributed execution of data-intensive scientific workflows

  • Authors:
  • Shannon Hastings;Matheus Ribeiro;Stephen Langella;Scott Oster;Umit Catalyurek;Tony Pan;Kun Huang;Renato Ferreira;Joel Saltz;Tahsin Kurc

  • Affiliations:
  • The Ohio State University, Columbus, OH;Universidade Federal de Minas Gerais, Belo Horizonte, MG - Brazil;The Ohio State University, Columbus, OH;The Ohio State University, Columbus, OH;The Ohio State University, Columbus, OH;The Ohio State University, Columbus, OH;The Ohio State University, Columbus, OH;Universidade Federal de Minas Gerais, Belo Horizonte, MG - Brazil;The Ohio State University, Columbus, OH;The Ohio State University, Columbus, OH

  • Venue:
  • ACM SIGMOD Record
  • Year:
  • 2005

Abstract

In this paper we look at the application of XML data management support in scientific data analysis workflows. We describe a software infrastructure that aims to address issues associated with metadata management, data storage and management, and execution of data analysis workflows on distributed storage and compute platforms. This system couples a distributed, filter-stream-based dataflow engine with a distributed XML-based data and metadata management system. We present experimental results from a biomedical image analysis use case that involves processing digitized microscopy images for feature segmentation.
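
As a rough illustration of the coupling the abstract describes, the sketch below chains three filters through streams and emits one XML metadata record per processed image tile. It is a single-process toy, not the paper's system: the function names (read_tiles, segment, describe), the XML element and attribute names, the threshold-based segmentation, and the sample pixel values are all assumptions made for this example. Python generators stand in for the streams that would connect filters across machines.

```python
import xml.etree.ElementTree as ET


def read_tiles(image, tile_width):
    """Source filter (hypothetical): cut a flat pixel list into fixed-width tiles."""
    for start in range(0, len(image), tile_width):
        yield start, image[start:start + tile_width]


def segment(tiles, threshold):
    """Processing filter (hypothetical): mark pixels at or above the threshold."""
    for offset, pixels in tiles:
        mask = [1 if p >= threshold else 0 for p in pixels]
        yield offset, mask


def describe(masks, image_id):
    """Sink filter (hypothetical): serialize per-tile results as XML records."""
    for offset, mask in masks:
        record = ET.Element("tile", {"image": image_id, "offset": str(offset)})
        # Store the number of foreground pixels found in this tile.
        ET.SubElement(record, "foreground").text = str(sum(mask))
        yield ET.tostring(record, encoding="unicode")


# Made-up pixel intensities standing in for a digitized microscopy image.
image = [12, 200, 180, 30, 90, 250, 10, 5]

# Each filter consumes the stream produced by the previous one.
pipeline = describe(segment(read_tiles(image, 4), threshold=100), "slide-001")
records = list(pipeline)

for xml_record in records:
    print(xml_record)
```

Because each filter only sees the stream it consumes, a stage could in principle be replicated or moved to another node without changing the others, and the XML records could be stored in and queried from an XML database rather than printed.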