We present an efficient algorithm for processing distributed queries in the presence of partition dependency. For a given query, the algorithm first partitions the referenced relations into a number of non-exclusive subsets such that the join operations among the relations in each subset can be processed locally, without data transfer. Each subset is associated with a set of processing sites and can be used to generate an execution plan for the query. The algorithm then determines which referenced fragmented relations outside the subset need only have their fragments, rather than the whole relation, replicated at the processing sites; the remaining referenced relations are duplicated in full at each processing site. Among the alternative plans, the algorithm picks the one with the minimum response time for the query. Experimental results show that the algorithm significantly improves the performance of distributed query processing.
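The plan-selection loop described above can be illustrated with a small sketch. It is not the paper's implementation: the `Relation` fields, the `locally_joinable` test (same partition key and same sites), and the transfer-only cost model in `response_time` are all simplifying assumptions made here for illustration, and the relation names and sizes are invented.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Relation:
    name: str
    size: float                      # total size, arbitrary units (hypothetical)
    partition_key: Optional[str]     # fragmentation attribute; None if unfragmented
    sites: FrozenSet[str]            # sites that hold the relation's fragments


def locally_joinable(subset):
    """Assumed test for partition dependency: every relation in the subset is
    fragmented on the same key and stored at the same sites."""
    keys = {r.partition_key for r in subset}
    sites = {r.sites for r in subset}
    return None not in keys and len(keys) == 1 and len(sites) == 1


def response_time(subset, relations, bandwidth=1.0):
    """Toy cost model: time for one processing site to receive what it lacks.
    Sites are assumed to receive data in parallel; local join cost is ignored."""
    key = subset[0].partition_key
    n_sites = len(subset[0].sites)
    received = 0.0
    for r in relations:
        if r in subset:
            continue                      # already in place, no transfer
        if r.partition_key == key:
            received += r.size / n_sites  # only the matching fragment is shipped
        else:
            received += r.size            # whole relation is duplicated at the site
    return received / bandwidth


def best_plan(relations):
    """Enumerate the (possibly overlapping) locally joinable subsets and
    return (response_time, subset) for the cheapest one."""
    candidates = [
        subset
        for k in range(1, len(relations) + 1)
        for subset in combinations(relations, k)
        if locally_joinable(subset)
    ]
    if not candidates:
        raise ValueError("no locally joinable subset found")
    return min(((response_time(s, relations), s) for s in candidates),
               key=lambda pair: pair[0])


# Invented example data.
relations = [
    Relation("orders", 100.0, "cust_id", frozenset({"A", "B"})),
    Relation("customers", 40.0, "cust_id", frozenset({"A", "B"})),
    Relation("lineitem", 200.0, "order_id", frozenset({"A", "B", "C", "D"})),
    Relation("nation", 2.0, None, frozenset({"A"})),
]

time, subset = best_plan(relations)
print(sorted(r.name for r in subset), time)  # ['lineitem'] 142.0
```

Under this toy model the cheapest plan leaves the largest relation in place and ships the smaller ones to its sites, which is the kind of trade-off the minimum-response-time selection is meant to capture. A realistic cost model would also account for local join cost and per-site load.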