Metabolic Flux Analysis in the Cloud

  • Authors:
  • Tolga Dalman;Tim Doernemann;Ernst Juhnke;Michael Weitzel;Matthew Smith;Wolfgang Wiechert;Katharina Noh;Bernd Freisleben

  • Venue:
  • ESCIENCE '10 Proceedings of the 2010 IEEE Sixth International Conference on e-Science
  • Year:
  • 2010

Abstract

The MapReduce pattern popularized by Google has been used successfully in several scientific applications. This paper investigates whether a MapReduce approach using on-demand resources from a Cloud is beneficial for simulation tasks in Systems Biology and whether it can be seamlessly integrated into a service-oriented scientific workflow framework. In particular, it presents an Amazon Elastic MapReduce Cloud implementation of the 13C-MFA (Metabolic Flux Analysis) Monte Carlo bootstrap approach, designed for integration into an existing BPEL-based scientific workflow system. A comparison of a 64-node MapReduce cluster with a single-node computation reveals a total performance gain of up to a factor of 14, at a total cost for on-demand resources of $11. The most critical performance factor is I/O: the application suffers because I/O operations on many small files are expensive with Amazon S3 and the Hadoop DFS.
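
To illustrate why a Monte Carlo bootstrap maps naturally onto MapReduce, the following is a minimal local sketch and not the authors' implementation. Each map task performs one independent bootstrap replicate (perturb the measurements, then re-fit), and the reduce step summarizes the replicate estimates. All names here (`perturb`, `fit`, `map_task`, `reduce_task`, `run_bootstrap`, the key `"flux"`) are hypothetical, and `fit` is a trivial stand-in (the mean) for the expensive 13C-MFA parameter fit, which the abstract does not describe.

```python
import random
import statistics


def perturb(measurements, sigmas, rng):
    """Draw one synthetic data set by adding Gaussian noise to each measurement."""
    return [rng.gauss(m, s) for m, s in zip(measurements, sigmas)]


def fit(sample):
    """Hypothetical stand-in for the expensive parameter fit (here: the mean)."""
    return statistics.mean(sample)


def map_task(task):
    """Map step: run one independent bootstrap replicate and emit (key, estimate)."""
    seed, measurements, sigmas = task
    rng = random.Random(seed)  # per-task seed keeps replicates independent and reproducible
    return ("flux", fit(perturb(measurements, sigmas, rng)))


def reduce_task(key, values):
    """Reduce step: summarize all replicate estimates that share a key."""
    values = list(values)
    return key, statistics.mean(values), statistics.stdev(values)


def run_bootstrap(measurements, sigmas, n_replicates):
    """Run the map and reduce steps locally, in sequence."""
    tasks = [(seed, measurements, sigmas) for seed in range(n_replicates)]
    mapped = map(map_task, tasks)  # on a cluster, these calls would run in parallel
    grouped = {}
    for key, value in mapped:  # the "shuffle": group emitted values by key
        grouped.setdefault(key, []).append(value)
    return [reduce_task(key, values) for key, values in grouped.items()]


if __name__ == "__main__":
    for key, mean, std in run_bootstrap([1.0, 2.0, 3.0], [0.1, 0.1, 0.1], 200):
        print(key, mean, std)
```

Because the replicates share no state, the map phase is embarrassingly parallel. In this sketch each task carries its input in memory; a design that instead writes one small file per replicate would run into the kind of I/O cost on many small files that the abstract reports for Amazon S3 and the Hadoop DFS.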