Massively parallel data analysis with PACTs on Nephele

  • Authors:
  • Alexander Alexandrov; Max Heimel; Volker Markl; Dominic Battré; Fabian Hueske; Erik Nijkamp; Stephan Ewen; Odej Kao; Daniel Warneke

  • Affiliations:
  • Technische Universität Berlin, Germany (all authors)

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2010

Abstract

Large-scale data analysis applications must process and analyze terabytes or even petabytes of data, particularly in web analysis and scientific data management. This trend was discussed as "web-scale data management" in a panel at VLDB 2009. Formerly, parallel data processing was the domain of parallel database systems. Today, novel requirements such as scaling out to thousands of machines, improved fault tolerance, and schema-free processing have made a case for new approaches.