As the volume of data that must be processed in a timely manner soars, computing and storage systems struggle to scale at the same pace. A hybrid cloud, which combines two or more clouds, is emerging as an appealing way to expand local/private systems. However, the effective use of such an expanded system is limited primarily by low network bandwidth and high latency between clouds (i.e., large intercloud data transmission overheads), particularly when applications or services span clouds and handle large data sets. Online analytical processing (OLAP) applications are a typical class of data-intensive applications: they process multi-dimensional analytical queries over 'big data' (or data warehouses). In this paper, we address the effective processing of MapReduce-based OLAP applications in a hybrid-cloud environment and present a (hybrid) cloud-aware OLAP system that incorporates data filtering techniques. Our system filters out data that need not be transmitted between clouds, with the ultimate goal of optimizing the performance-to-cost ratio, or cost efficiency. In experiments with two large-scale data analysis benchmarks, our system improves cost efficiency and reduces intercloud network traffic by 76% to 99%.
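The abstract does not say which filtering technique the system uses. As an illustration only, the following minimal sketch assumes a Bloom filter: one cloud summarizes the join keys that a query actually needs, and the other cloud uses that compact summary to drop rows before they cross the intercloud link. The names `BloomFilter` and `filter_for_transfer`, the filter size, and the row layout are hypothetical and are not taken from the paper.

```python
import hashlib


class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        # Sizes are illustrative assumptions, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def filter_for_transfer(rows, key_of, needed_keys):
    """Keep only rows whose key may be needed on the other side.

    Runs on the sending cloud; `needed_keys` is the Bloom filter
    received from the cloud that will finish the query.
    """
    return [row for row in rows if key_of(row) in needed_keys]
```

In this sketch the receiving cloud would build the filter from the keys that survive its local query predicates and ship only the bit array to the sender. Because a Bloom filter never reports a needed key as absent, every required row is still transmitted; a false positive only costs a few extra rows, which the final join discards.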