OLAP analysis is a fundamental tool for enterprises in competitive markets. While known (planned) queries can be tuned to provide fast answers, ad-hoc queries have to process huge volumes of the base DW data and thus have slower response times. Parallel architectures can improve performance through a divide-and-conquer approach, but their structure is rigid and suffers from scalability limitations imposed by the star schema model used in most deployments. They are therefore usually over-dimensioned with computational resources in order to provide fast response times. However, for most business decisions, it is more important to have guarantees that queries will be answered in a timely fashion. The physical representation of the star schema model severely limits scalability and the ability to provide timely execution, due to the well-known parallel join issue and the need for solutions such as on-the-fly repartitioning of data or intermediate results, or massive replication of large data sets that still need to be joined locally. In this paper, we propose PH-ONE, an architecture that overcomes the scalability limitations by combining an elastic set of inexpensive heterogeneous nodes with a denormalized DW storage organization, which requires only a minimal set of predictable processing tasks, in a shared-nothing scheme that removes costly joins. PH-ONE delivers timely execution guarantees by adjusting the number of processing nodes and by rebalancing the data load according to the nodes' characteristics. We used the TPC-H benchmark to evaluate PH-ONE's ability to provide timely results.
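To make the two mechanisms named in the abstract concrete (adjusting the node count and rebalancing data according to node characteristics), the following is a minimal sketch and not the PH-ONE implementation. It assumes that, because the denormalized table is scanned without joins, each node's work is roughly proportional to the rows it holds, and that each node can be characterized by a single measured scan rate in rows per second. The function names `allocate_rows`, `estimated_time`, and `nodes_needed` are hypothetical.

```python
from math import ceil


def allocate_rows(total_rows, node_rates):
    """Split rows across heterogeneous nodes in proportion to their scan rates.

    node_rates holds one integer rows-per-second figure per node (an assumed,
    simplified model of "node characteristics").
    """
    total_rate = sum(node_rates)
    # Integer proportional share for every node.
    shares = [total_rows * rate // total_rate for rate in node_rates]
    # Flooring leaves fewer than len(node_rates) rows unassigned;
    # hand them to the fastest nodes, one row each.
    remainder = total_rows - sum(shares)
    fastest_first = sorted(range(len(node_rates)),
                           key=lambda i: node_rates[i], reverse=True)
    for i in fastest_first[:remainder]:
        shares[i] += 1
    return shares


def estimated_time(shares, node_rates):
    """Query time is bounded by the slowest node to finish its local scan."""
    return max(rows / rate for rows, rate in zip(shares, node_rates))


def nodes_needed(total_rows, rate_per_node, deadline_s):
    """Smallest number of identical nodes that can scan all rows by the deadline."""
    return ceil(total_rows / (rate_per_node * deadline_s))


if __name__ == "__main__":
    rates = [100, 300, 600]            # hypothetical rows/s for three nodes
    shares = allocate_rows(1000, rates)
    print(shares, estimated_time(shares, rates))
    print(nodes_needed(1_000_000, 50_000, 4))
```

Under this simplified model, a proportional split makes all nodes finish at about the same time, so the estimated response time depends only on the total data volume and the aggregate scan rate. That is the property that would let a target response time be met by adding or removing nodes.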