Query response time is one of the key requirements of a data warehouse. Among the methods of improving query performance, parallel processing, especially in the shared-nothing class, offers practically unlimited scalability. Data warehouse systems are highly complex in their structure, data model, and the many mechanisms they use, all of which strongly influence overall performance. The main problem in balancing a parallel data warehouse is the allocation of data among system nodes, and the problem grows when nodes have different computational characteristics. In this paper we present an algorithm for balancing a distributed data warehouse built on a shared-nothing architecture. Balancing is achieved by iteratively adjusting the size of the dataset stored on each node. We employ well-known data allocation schemes based on space-filling curves: Hilbert and Peano. We provide a collection of system test results and their analysis, which confirm that the balancing algorithm can be realized in the proposed way.
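The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, under stated assumptions: cells of a two-dimensional grid are ordered along a Hilbert curve, the ordered sequence is cut into contiguous ranges (one per node), and the range sizes are adjusted iteratively from measured response times. The function names (`hilbert_index`, `split_sizes`, `rebalance`, `allocate`) and the proportional update rule are hypothetical illustrations, not the authors' method.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve on an n x n grid.

    n must be a power of two. This is the standard iterative
    coordinate-to-index conversion.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def split_sizes(total, shares):
    """Turn fractional shares into integer chunk sizes that sum to total."""
    norm = sum(shares)
    sizes, cum, prev = [], 0.0, 0
    for i, share in enumerate(shares):
        cum += share / norm
        # Pin the last boundary to total to avoid floating-point drift.
        bound = total if i == len(shares) - 1 else round(cum * total)
        sizes.append(bound - prev)
        prev = bound
    return sizes


def rebalance(shares, times):
    """One balancing step (assumed rule): a node's new share is
    proportional to its observed throughput, share / response time."""
    rates = [s / t for s, t in zip(shares, times)]
    total = sum(rates)
    return [r / total for r in rates]


def allocate(n, shares):
    """Assign grid cells to nodes as contiguous ranges of the Hilbert order."""
    cells = sorted(
        ((x, y) for x in range(n) for y in range(n)),
        key=lambda c: hilbert_index(n, c[0], c[1]),
    )
    chunks, start = [], 0
    for size in split_sizes(len(cells), shares):
        chunks.append(cells[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    # Simulated heterogeneous nodes: the middle node is twice as fast.
    speeds = [1.0, 2.0, 1.0]
    shares = [1 / 3, 1 / 3, 1 / 3]
    for _ in range(10):
        # Stand-in for real measurements: time grows with data held.
        times = [s / v for s, v in zip(shares, speeds)]
        if max(times) - min(times) < 1e-9:
            break
        shares = rebalance(shares, times)
    print([len(chunk) for chunk in allocate(4, shares)])
```

In a real system the simulated `times` would be replaced by response times measured on each node, and the loop would stop once the spread of those times falls below a chosen tolerance. Keeping each node's data as a contiguous range of the curve means that a resize only moves cells across the boundaries between neighbouring ranges.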