Cloud-aware processing of MapReduce-based OLAP applications

  • Authors:
  • Hyuck Han; Young Choon Lee; Seungmi Choi; Heon Y. Yeom; Albert Y. Zomaya

  • Affiliations:
  • Seoul National University, Seoul, Korea; University of Sydney, NSW, Australia; Seoul National University, Seoul, Korea; Seoul National University, Seoul, Korea; University of Sydney, NSW, Australia

  • Venue:
  • AusPDC '13 Proceedings of the Eleventh Australasian Symposium on Parallel and Distributed Computing - Volume 140
  • Year:
  • 2013

Abstract

As the volume of data that must be processed in a timely manner soars, computing and storage systems struggle to scale at the pace of this explosive data growth. A hybrid cloud, which combines two or more clouds, is emerging as an appealing way to expand local/private systems. However, the effective use of such an expanded cloud system is limited primarily by the low network bandwidth and high latency between clouds (i.e., large intercloud data transmission overheads) when applications/services span clouds, particularly when they deal with large data. Online analytical processing (OLAP) applications are a typical class of data-intensive applications. These applications process multi-dimensional analytical queries over 'big data' (or data warehouses). In this paper, we address the effective processing of MapReduce-based OLAP applications in a hybrid-cloud environment and present a (hybrid) cloud-aware OLAP system that incorporates data filtering techniques. Our system filters out unnecessary data before intercloud transmission, with the ultimate goal of optimizing the performance-to-cost ratio, or cost efficiency. In experiments with two large-scale data analysis benchmarks, our system improves cost efficiency and reduces intercloud network traffic by 76% to 99%.
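
The abstract does not describe the filtering techniques in detail, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes that filtering means dropping the rows and columns a query does not need before the data leaves the local cloud. The table, the query, and the names filter_for_query and serialized_size are hypothetical, and JSON byte length stands in for transfer size.

```python
import json


def filter_for_query(rows, needed_columns, predicate):
    """Yield only the rows that satisfy the predicate, reduced to the needed columns."""
    for row in rows:
        if predicate(row):
            yield {col: row[col] for col in needed_columns}


def serialized_size(rows):
    """Approximate transfer size as the total byte length of the JSON-encoded rows."""
    return sum(len(json.dumps(row).encode("utf-8")) for row in rows)


# Hypothetical fact table held in the local/private cloud.
sales = [
    {
        "order_id": i,
        "region": "APAC" if i % 4 == 0 else "EMEA",
        "amount": 10.0 * i,
        "comment": "x" * 200,  # wide column that the query never reads
    }
    for i in range(1000)
]

# Hypothetical OLAP query: total sales amount for the APAC region.
# Only the "amount" column of APAC rows needs to cross the intercloud link.
filtered = list(
    filter_for_query(sales, ["amount"], lambda r: r["region"] == "APAC")
)

before = serialized_size(sales)
after = serialized_size(filtered)
reduction = 1 - after / before  # fraction of intercloud traffic avoided
total = sum(r["amount"] for r in filtered)

print(f"rows sent: {len(filtered)} of {len(sales)}")
print(f"traffic reduction: {reduction:.1%}")
```

The reduction in this toy case depends entirely on the made-up table and query; it is not comparable to the 76% to 99% figures reported for the paper's benchmarks.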