In this paper we present a Parallel Spatial Data Warehouse (PSDW) system that we use for the aggregation and analysis of large amounts of spatial data. The data is generated by utility meters communicating via radio. The PSDW system is based on a data model called the cascaded star model. To provide satisfactory interactivity, the PSDW system uses parallel computing supported by a special indexing structure called an aggregation tree. Balancing the PSDW workload is essential to minimize the response time of submitted tasks. We have implemented two data partitioning schemes, which use Hilbert and Peano curves for space ordering. The presented balancing algorithm iteratively calculates the optimal size of the partitions loaded into each node by executing a series of aggregations on a test data set. We provide a collection of system test results and their analysis, which confirm that the balancing algorithm can be realized in the proposed way. During the ETL (Extraction, Transformation and Loading) process, large amounts of data are transformed and loaded into the PSDW. ETL processes are sometimes interrupted by a failure. In such a case, one of the interrupted-extraction resumption algorithms is usually used. In this paper we analyze the influence of the data balancing used in the PSDW on the efficiency of the extraction and resumption processes.
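The partitioning and balancing idea can be illustrated with a minimal sketch. It is not the PSDW implementation: the abstract does not specify how partition sizes are updated, so the rule used here (scale each node's share by its measured throughput on the test aggregation) is an assumption, and the names `hilbert_index`, `split_by_weights`, and `rebalance` are hypothetical. Only the Hilbert ordering is shown; a Peano ordering would replace the key function.

```python
def hilbert_index(order, x, y):
    """Position of grid cell (x, y) along a Hilbert curve
    covering a 2**order x 2**order grid (standard bitwise conversion)."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def split_by_weights(items, weights):
    """Cut an ordered list into contiguous partitions whose sizes are
    proportional to the given per-node weights."""
    total = sum(weights)
    parts, start, acc = [], 0, 0.0
    for i, w in enumerate(weights):
        acc += w
        last = i == len(weights) - 1
        end = len(items) if last else round(len(items) * acc / total)
        parts.append(items[start:end])
        start = end
    return parts


def rebalance(weights, times):
    """One balancing iteration (assumed rule): a node that processed its
    share faster receives a proportionally larger share next time."""
    speeds = [w / t for w, t in zip(weights, times)]
    total = sum(speeds)
    return [s / total for s in speeds]


# Hypothetical meter locations on an 8 x 8 grid, ordered along the curve.
meters = [(x, y) for x in range(8) for y in range(8)]
ordered = sorted(meters, key=lambda c: hilbert_index(3, c[0], c[1]))

weights = [0.5, 0.5]                         # start with equal shares for two nodes
partitions = split_by_weights(ordered, weights)
weights = rebalance(weights, [2.0, 1.0])     # made-up test-aggregation times, in seconds
partitions = split_by_weights(ordered, weights)
```

Because each partition is a contiguous run of the curve, the meters assigned to a node tend to be spatially close. Repeating the measure-and-rebalance step until the per-node times are roughly equal mirrors the iterative character of the algorithm described above.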