The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling
Experimental Evaluation of a New Distributed Partitioning Technique for Data Warehouses
IDEAS '01 Proceedings of the International Database Engineering & Applications Symposium
Vertical fragmentation of XML data warehouses using frequent path sets
DaWaK'11 Proceedings of the 13th international conference on Data warehousing and knowledge discovery
Avatara: OLAP for web-scale analytics products
Proceedings of the VLDB Endowment
The Data Warehouse Striping (DWS) technique is a data partitioning approach designed specifically for distributed data warehousing environments. In DWS, the fact tables are distributed across an arbitrary number of low-cost computers, and queries are executed in parallel by all of them, guaranteeing nearly optimal speedup and scale-up. Data loading in data warehouses is typically a heavy process, and it becomes even more complex in distributed environments. Data partitioning calls for new loading algorithms that reconcile a balanced distribution of data among nodes with an efficient data allocation, which is vital to achieving low and uniform response times and, consequently, high query performance. This paper evaluates several alternative algorithms and proposes a generic approach for evaluating data distribution algorithms in the context of DWS. The experimental results show that effective loading of the nodes in a DWS system must consider complementary effects: minimizing the number of distinct keys of any large dimension in the fact tables of each node, and splitting correlated rows among the nodes.
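The tension between balanced distribution and few distinct dimension keys per node can be illustrated with a minimal sketch. It is not one of the algorithms evaluated in the paper: the two placement functions (`round_robin`, `by_dimension_key`), the `evaluate` helper, the `customer_id` column, and the synthetic rows are all assumptions made for illustration only.

```python
# Illustrative sketch only: two simple ways to place fact rows on nodes,
# plus two metrics in the spirit of the abstract (rows per node, and
# distinct keys of one large dimension per node). All names are hypothetical.

def round_robin(rows, n_nodes):
    """Place row i on node i mod n_nodes (balanced row counts)."""
    nodes = [[] for _ in range(n_nodes)]
    for i, row in enumerate(rows):
        nodes[i % n_nodes].append(row)
    return nodes

def by_dimension_key(rows, n_nodes, key):
    """Place each row on the node chosen by its dimension key,
    so all rows sharing a key land on the same node."""
    nodes = [[] for _ in range(n_nodes)]
    for row in rows:
        nodes[row[key] % n_nodes].append(row)
    return nodes

def evaluate(nodes, key):
    """Report load balance and distinct dimension keys for each node."""
    return {
        "rows_per_node": [len(n) for n in nodes],
        "distinct_keys_per_node": [len({r[key] for r in n}) for n in nodes],
    }

# Synthetic fact rows (assumed data): 12 rows over 4 customer keys.
rows = [{"customer_id": i % 4, "amount": 10 * i} for i in range(12)]

rr_stats = evaluate(round_robin(rows, 3), "customer_id")
key_stats = evaluate(by_dimension_key(rows, 3, "customer_id"), "customer_id")
print(rr_stats)
print(key_stats)
```

On this toy input, round-robin gives every node the same number of rows but every node sees all four customer keys, whereas key-based placement reduces the distinct keys per node at the cost of an uneven row count. A loading algorithm of the kind the abstract describes would have to weigh both effects.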