A MapReduceMerge-based Data Cube Construction Method

Authors:
Yuxiang Wang; Aibo Song; Junzhou Luo
Venue:
GCC '10 Proceedings of the 2010 Ninth International Conference on Grid and Cloud Computing
Year:
2010

Abstract

Pre-computing data cubes is critical to improving the response time of On-Line Analytical Processing (OLAP) systems. However, as data volumes grow, the time needed to construct data cubes becomes a significant performance bottleneck, so a parallel pre-computation approach is needed to further improve OLAP performance. Current parallel approaches fall into two categories: work partitioning and data partitioning. The first cannot guarantee load balance among processors, and the second produces massive data movement between processors. This paper proposes a MapReduceMerge-based parallel data cube construction method with a read-optimized data storage strategy that is better suited to OLAP. Compared with traditional approaches, our method ensures good load balancing and reduces the amount of data movement. MapReduceMerge is an extension of MapReduce, a programming model that enables easy development of parallel applications for processing massive data on large clusters. MapReduce is a key element of Hadoop (a cloud computing framework), which is used to support Facebook's business in a cloud environment. We modify the original MapReduceMerge framework to meet the needs of cuboid construction and show the implementation in detail through an example of 2-dimensional cuboid construction. We also discuss optimizations for constructing multi-dimensional cuboids.
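
To make the map, reduce, and merge roles concrete, the following is a minimal single-process sketch of building a 2-dimensional cuboid from two data partitions. It is not the paper's implementation: the fact rows, the (product, region) dimensions, the SUM measure, and all function names are assumptions chosen for illustration, and the merge step here simply adds matching cells from two reduced outputs.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, sales). The two lists stand in
# for two data partitions held by different workers.
PARTITION_A = [("pen", "east", 3), ("pen", "west", 5), ("ink", "east", 2)]
PARTITION_B = [("pen", "east", 4), ("ink", "west", 7), ("ink", "east", 1)]


def map_phase(rows):
    """Emit ((product, region), sales) pairs, one per fact row."""
    return [((product, region), sales) for product, region, sales in rows]


def reduce_phase(pairs):
    """Sum the measure per cuboid cell within a single partition."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def merge_phase(left, right):
    """Combine two reduced outputs by key, adding cells present in both."""
    merged = dict(left)
    for key, value in right.items():
        merged[key] = merged.get(key, 0) + value
    return merged


def build_cuboid(partitions):
    """Run map and reduce per partition, then merge the partial cuboids."""
    cuboid = {}
    for rows in partitions:
        cuboid = merge_phase(cuboid, reduce_phase(map_phase(rows)))
    return cuboid


if __name__ == "__main__":
    result = build_cuboid([PARTITION_A, PARTITION_B])
    for cell in sorted(result):
        print(cell, result[cell])
```

In this sketch each partition is aggregated locally before any results are combined, so only the partial cuboids, not the raw fact rows, would need to move between workers in a distributed setting.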