Parallel Real-Time OLAP on Multi-core Processors
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Efficient distributed parallel top-down computation of ROLAP data cube using mapreduce
DaWaK'12 Proceedings of the 14th international conference on Data Warehousing and Knowledge Discovery
Closed cubing is an efficient data cube compression technique proposed recently in the literature. It losslessly condenses a group of cells into a single cell when these cells share the same aggregate value and the condensation preserves roll-up/drill-down semantics. Despite its importance, parallel closed cubing for very large data sets has, to the best of the authors’ knowledge, not been well studied. This paper presents a parallel closed cube construction and query algorithm that runs on low-cost PC clusters using the MapReduce framework. In addition, we prove that as the number of data blocks increases, the storage size of the closed cubes gradually decreases. Users can therefore choose the number of data blocks to trade off cube storage against query time. An experimental study demonstrates that the algorithm is efficient and scalable.
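To make the notion of a closed cell concrete, the following is a minimal single-machine sketch in Python. It is not the paper's MapReduce algorithm. It assumes a COUNT aggregate and treats cells as equivalent when they cover exactly the same tuples, which is one common way to define closedness. The names `ALL`, `closure`, `closed_cube`, and the three-row sample table are hypothetical and chosen only for illustration.

```python
from itertools import product

ALL = "*"  # hypothetical marker for an aggregated ("don't care") dimension


def closure(cell, rows):
    """Return (closed_cell, count) for `cell`, or (None, 0) if it covers no rows.

    The closed cell fixes every dimension on which all covered rows agree,
    so it is the most specific cell covering exactly the same rows.
    """
    covered = [
        r for r in rows
        if all(c == ALL or c == v for c, v in zip(cell, r))
    ]
    if not covered:
        return None, 0
    closed = tuple(
        col[0] if len(set(col)) == 1 else ALL
        for col in zip(*covered)
    )
    return closed, len(covered)


def closed_cube(rows):
    """Brute-force the full cube and its closed cube (COUNT aggregate)."""
    n_dims = len(rows[0])
    # Each dimension may take any observed value or the ALL marker.
    domains = [sorted({r[i] for r in rows}) + [ALL] for i in range(n_dims)]
    full, closed = {}, {}
    for cell in product(*domains):
        closed_cell, count = closure(cell, rows)
        if closed_cell is None:
            continue  # empty cells are not materialized
        full[cell] = count
        closed[closed_cell] = count  # all equivalent cells collapse here
    return full, closed


# Assumed toy fact table with three dimensions (A, B, C).
rows = [
    ("a1", "b1", "c1"),
    ("a1", "b1", "c2"),
    ("a2", "b2", "c1"),
]
full, closed = closed_cube(rows)
print(len(full), "cells in the full cube;", len(closed), "closed cells")
for cell, count in sorted(closed.items()):
    print(cell, count)
```

On this toy table the 18 non-empty cells of the full cube condense to 6 closed cells. For example, (a1, *, *), (*, b1, *), and (a1, b1, *) all cover the same two rows and are represented by the single closed cell (a1, b1, *). A query on any non-closed cell can be answered by computing its closure and looking up the stored value, which is why the compression is lossless.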