Database Size Estimation by Query Performance -- A Complexity Aspect

  • Authors:
  • Ye Zhou;Chi-Hung Chi

  • Venue:
  • UCC '12 Proceedings of the 2012 IEEE/ACM Fifth International Conference on Utility and Cloud Computing
  • Year:
  • 2012

Abstract

Many techniques have been proposed for database size estimation. However, the emergence of cloud computing introduces new opportunities along with new challenges. In the cloud, the service provider owns the infrastructure and can therefore set up a monitoring proxy. The collected data allows the service provider to estimate the size of a database that may otherwise be a black box to it. We claim that the relationship between query performance and data size can be captured by a complexity function. Given a query's execution time, one can use this function to estimate table size. In this paper, we propose a fine-grained framework called Database Size Estimation based on Complexity (DSEC) to estimate the size of databases from the perspective of the service provider. In particular, we argue that only a small fraction of tables, which we refer to as "important tables", significantly impact service performance. We illustrate the process of locating important tables on three typical benchmarks: RUBiS, RUBBoS and TPC-W. Finally, we describe extensive experiments on TPC-W (the most challenging of the three) to evaluate the effectiveness and efficiency of DSEC in various scenarios.
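The central idea of the abstract, that a complexity function links query execution time to table size and can be inverted to recover the size, can be illustrated with a small sketch. This is not the DSEC framework itself, whose details the abstract does not give. It assumes that the time of one query type follows t = a * g(n) for some simple shape g, and that a few (size, time) calibration pairs are available. The candidate shapes and the names `CANDIDATES`, `fit_scale`, `pick_model` and `estimate_size` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical candidate complexity shapes g(n); the abstract does not list any.
CANDIDATES = {
    "log": lambda n: math.log2(n),
    "linear": lambda n: float(n),
    "nlogn": lambda n: n * math.log2(n),
}


def fit_scale(g, sizes, times):
    """Least-squares fit of t = a * g(n) with a single coefficient a.

    Returns the coefficient and the sum of squared residuals.
    """
    xs = [g(n) for n in sizes]
    a = sum(x * t for x, t in zip(xs, times)) / sum(x * x for x in xs)
    err = sum((a * x - t) ** 2 for x, t in zip(xs, times))
    return a, err


def pick_model(sizes, times):
    """Return (shape name, coefficient) for the best-fitting candidate shape."""
    fits = []
    for name, g in CANDIDATES.items():
        a, err = fit_scale(g, sizes, times)
        fits.append((err, name, a))
    _, name, a = min(fits)
    return name, a


def estimate_size(name, a, observed_time, lo=2, hi=10**12):
    """Invert t = a * g(n) by bisection over integer n.

    Assumes g is increasing on [lo, hi]. Returns the smallest n in (lo, hi]
    whose predicted time reaches observed_time, or hi if none does.
    """
    g = CANDIDATES[name]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if a * g(mid) < observed_time:
            lo = mid
        else:
            hi = mid
    return hi
```

As a usage example with synthetic data (not measurements from the paper): calibrating on `sizes = [1000, 2000, 4000, 8000]` with `times = [2e-6 * n for n in sizes]` makes `pick_model(sizes, times)` select the linear shape, and `estimate_size("linear", a, 0.02)` then comes out at roughly 10,000 rows. Real query timings are noisy and depend on indexes, caching and concurrency, so a practical estimator would need more calibration points and a more careful model than this single-coefficient fit.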