Data intensive applications on clouds

  • Authors:
  • Geoffrey C. Fox

  • Affiliations:
  • Indiana University, Bloomington, IN, USA

  • Venue:
  • Proceedings of the second international workshop on Data intensive computing in the clouds
  • Year:
  • 2011

Abstract

The cyberinfrastructure supporting science appears likely to include large-scale simulation systems headed to exascale, combined with cloud-like systems supporting data-intensive and high-throughput computing, pleasingly parallel jobs, and the long tail of science. Clouds offer economies of scale, elasticity supporting real-time and interactive use, and powerful new programming models such as MapReduce. We stress that iterative extensions of MapReduce will be necessary to get good performance for several data mining (analytics) applications. We give several illustrations, mainly from bioinformatics. We suggest that the data deluge implies a corresponding increase in the computational resources needed to support analysis, and that this in turn suggests new architectures for large-scale data repositories.