EvolvingSpace: A Data Centric Framework for Integrating Bioinformatics Applications

  • Authors:
  • Chen Wang; Bing Bing Zhou; Albert Y. Zomaya

  • Affiliations:
  • CSIRO ICT Center, Australia; The University of Sydney, Sydney; The University of Sydney, Sydney

  • Venue:
  • IEEE Transactions on Computers
  • Year:
  • 2010

Quantified Score

Hi-index 14.98

Abstract

The paper presents EvolvingSpace, a data-centric distributed system intended to address the data and application integration problem in bioinformatics data centers. The system employs commodity PCs for data storage and computation. EvolvingSpace manages data in a decentralized manner, which is convenient for storing data annotations and can eliminate potential data-access bottlenecks. It indexes distributed data at multiple levels to facilitate the construction of complex workflows that consist of applications running on different types of data. In addition, the paper proposes a data-locality- and workflow-aware scheduling algorithm (ES-Scheduling) to balance data distribution with computing performance, and throughput with workflow response time. We ran extensive experiments on the system with real bioinformatics applications. Our results show that the system runs integrated bioinformatics applications efficiently and scales well.
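
To make the idea of data-locality-aware scheduling concrete, the following is a minimal sketch of a generic greedy placement rule. It is not ES-Scheduling, whose details the abstract does not give. The names `Node`, `pick_node`, `schedule`, and `load_penalty`, and the scoring rule itself (local blocks minus a load penalty), are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical storage/compute node (a commodity PC in the abstract's terms)."""
    name: str
    blocks: set = field(default_factory=set)  # data blocks stored on this node
    load: int = 0                             # tasks assigned so far


def pick_node(nodes, needed_blocks, load_penalty=1.0):
    """Pick the node with the highest locality-minus-load score (assumed rule)."""
    def score(node):
        local = len(node.blocks & needed_blocks)  # blocks that need no transfer
        return local - load_penalty * node.load
    return max(nodes, key=score)


def schedule(tasks, nodes, load_penalty=1.0):
    """Greedily place (task_name, needed_blocks) pairs; returns {task: node name}."""
    placement = {}
    for task_name, needed in tasks:
        node = pick_node(nodes, set(needed), load_penalty)
        node.load += 1  # account for the new task before placing the next one
        placement[task_name] = node.name
    return placement


# Illustrative data only: two nodes, three tasks that all prefer node n1's data.
nodes = [Node("n1", {"a", "b"}), Node("n2", {"c"})]
tasks = [("t1", ["a", "b"]), ("t2", ["a"]), ("t3", ["a"])]
placement = schedule(tasks, nodes)
```

In this sketch, tasks are drawn to the node that already holds their data, but the load penalty eventually pushes work to an idle node even when that requires a data transfer. That is the kind of trade-off between data distribution and computing performance the abstract describes, reduced here to a single tunable weight.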