Building cubes with MapReduce

  • Authors:
  • Alberto Abelló;Jaume Ferrarons;Oscar Romero

  • Affiliations:
  • Universitat Politècnica de Catalunya, BarcelonaTech, Barcelona, Spain;Universitat Politècnica de Catalunya, BarcelonaTech, Barcelona, Spain;Universitat Politècnica de Catalunya, BarcelonaTech, Barcelona, Spain

  • Venue:
  • Proceedings of the ACM 14th international workshop on Data Warehousing and OLAP
  • Year:
  • 2011

Abstract

In recent years, the problems of using generic storage techniques for very specific applications have been detected and outlined. As a result, alternatives to relational DBMSs (e.g., BigTable) are flourishing. At the same time, cloud computing is already a reality that helps save money by eliminating fixed hardware and software costs in favor of paying per use. Specific software tools to exploit a cloud are also available, and the trend is toward tools based on the MapReduce paradigm developed by Google. In this paper, we explore the possibility of keeping data in a cloud, using BigTable to store the corporate historical data and MapReduce as an agile mechanism to deploy cubes in ad-hoc Data Marts. Our main contribution is the comparison of three different approaches to retrieving data cubes from BigTable by means of MapReduce, and the definition of criteria for choosing among them.