As cloud-based solutions have become one of the main choices for intensive data analysis, both for business decision making and for scientific purposes, users face the problem of choosing among different cloud providers. In this work, we deal with data analysis flows that can be split into stages, where each stage can run on multiple cloud infrastructures. For each stage, a cloud provider may make a bid in the form of a continuous function in the time delay versus monetary cost domain. The goal is to compute the optimal combination of bids according to how much a user is prepared to pay for the total time delay of executing the analysis task. The contributions of this work are (i) a solution for the generic case that can be computed in pseudo-polynomial time with bounded relative error; (ii) exact polynomial-time solutions for specific cases; and (iii) an experimental evaluation of our proposal against other techniques. Our extensive results show improvements of up to an order of magnitude compared to existing heuristics.
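To make the problem statement concrete, the following is a minimal sketch of one way a pseudo-polynomial search over bid combinations could look. It is not the algorithm from the paper. It assumes that stages run one after another so their delays add up, that time is discretized into integer steps (the source of the pseudo-polynomial cost and of any approximation error), and that the user's preference is given as a function from total delay to the amount they are prepared to pay. The names `best_bid_combination`, `willingness_to_pay`, and `max_steps` are hypothetical.

```python
from typing import Callable, List, Tuple

# A bid maps a stage delay (in integer time steps) to a monetary cost.
Bid = Callable[[int], float]


def best_bid_combination(
    stages: List[List[Bid]],
    willingness_to_pay: Callable[[int], float],
    max_steps: int,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (surplus, plan), where plan[i] = (bid index, delay) for stage i.

    Assumption: stages are sequential, so the total delay is the sum of the
    per-stage delays, and each stage takes at least one time step.
    """
    inf = float("inf")
    # min_cost[t] = cheapest cost of running the stages seen so far in exactly t steps
    min_cost = [0.0] + [inf] * max_steps
    choices = []  # choices[i][t] = (bid index, delay) of stage i on the best path to t

    for bids in stages:
        new_cost = [inf] * (max_steps + 1)
        choice = [None] * (max_steps + 1)
        for t_prev in range(max_steps + 1):
            if min_cost[t_prev] == inf:
                continue
            for delay in range(1, max_steps - t_prev + 1):
                for b, bid in enumerate(bids):
                    cost = min_cost[t_prev] + bid(delay)
                    if cost < new_cost[t_prev + delay]:
                        new_cost[t_prev + delay] = cost
                        choice[t_prev + delay] = (b, delay)
        min_cost = new_cost
        choices.append(choice)

    feasible = [t for t in range(max_steps + 1) if min_cost[t] < inf]
    if not feasible:
        raise ValueError("max_steps is too small to schedule every stage")

    # Pick the total delay that maximizes what the user would pay minus what it costs.
    best_t = max(feasible, key=lambda t: willingness_to_pay(t) - min_cost[t])

    # Walk the recorded choices backwards to recover the per-stage decisions.
    plan = []
    t = best_t
    for choice in reversed(choices):
        b, delay = choice[t]
        plan.append((b, delay))
        t -= delay
    plan.reverse()
    return willingness_to_pay(best_t) - min_cost[best_t], plan
```

The running time of this sketch grows with the number of stages, the number of bids per stage, and the square of `max_steps`, so it is polynomial in the numeric value of the time horizon rather than in its encoding length. A finer time grid reduces discretization error at the price of more work.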