Large relations in node-partitioned data warehouses

Authors: Pedro Furtado
Affiliations: DEI/CISUC, Universidade de Coimbra, Portugal
Venue: DASFAA'05 Proceedings of the 10th international conference on Database Systems for Advanced Applications
Year: 2005


Abstract

A low-cost shared-nothing environment can provide significant speedup on large data warehouses, but partitioning and placement decisions are critical in such systems, because repartitioning requirements can result in much less-than-linear speedup. This problem can be minimized if the query workload and schemas are taken as inputs to placement decisions. In this paper we use a cost model to analyze the problem of handling large relations in a node-partitioned data warehouse (NPDW) under a basic placement strategy that partitions facts horizontally and replicates dimensions. We then propose a strategy to improve performance and show both analytical and TPC-H results.
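
The basic placement strategy named in the abstract (facts partitioned horizontally, dimensions replicated) can be illustrated with a small sketch. This is not the paper's implementation: the function names `place`, `local_sum_by_region`, and `query`, the modulo partitioning rule, and the sample schema are all assumptions made for illustration. It only shows why a join against a replicated dimension needs no repartitioning: each node joins its own fact fragment locally and the partial aggregates are merged afterwards.

```python
from collections import Counter


def place(facts, dims, n_nodes, part_key):
    """Partition fact rows across nodes by key; give every node a copy of the dimensions."""
    nodes = [{"facts": [], "dims": dict(dims)} for _ in range(n_nodes)]
    for row in facts:
        # Assumed rule: integer key modulo node count (a stand-in for hash partitioning).
        nodes[row[part_key] % n_nodes]["facts"].append(row)
    return nodes


def local_sum_by_region(node):
    """Join the local fact fragment with the local customer replica and aggregate."""
    partial = Counter()
    for row in node["facts"]:
        region = node["dims"]["customer"][row["cust_id"]]["region"]
        partial[region] += row["amount"]
    return partial


def query(nodes):
    """Merge the per-node partial sums into the final answer."""
    total = Counter()
    for node in nodes:
        total.update(local_sum_by_region(node))
    return dict(total)


# Hypothetical sample data: one small dimension and a three-row fact table.
sample_dims = {"customer": {1: {"region": "EU"}, 2: {"region": "US"}}}
sample_facts = [
    {"order_id": 0, "cust_id": 1, "amount": 10},
    {"order_id": 1, "cust_id": 2, "amount": 20},
    {"order_id": 2, "cust_id": 1, "amount": 30},
]

if __name__ == "__main__":
    print(query(place(sample_facts, sample_dims, 3, "order_id")))
```

In this sketch every node stores the full `customer` dimension, so storage and local join cost grow with the dimension's size regardless of the number of nodes. That is the kind of overhead a large relation introduces under replication, and it is the case the abstract sets out to analyze.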