A predictable storage model for scalable parallel DW

  • Authors:
  • João Pedro Costa;José Cecilio;Pedro Martins;Pedro Furtado

  • Affiliations:
  • ISEC-IPC, Rua Pedro Nunes, Coimbra, Portugal;University of Coimbra, Coimbra, Portugal;University of Coimbra, Coimbra, Portugal;University of Coimbra, Coimbra, Portugal

  • Venue:
  • Proceedings of the 15th Symposium on International Database Engineering & Applications
  • Year:
  • 2011

Abstract

The star schema model has been widely used as the de facto DW storage organization on RDBMSs. Business measures are stored in a central fact table along with a set of foreign keys referencing dimension tables. While this storage organization offers a good trade-off between storage size and performance on a single node, it does not scale in a predictable manner in shared-nothing parallel architectures. Although fact tables can be linearly partitioned among nodes, the same does not apply to dimensions. As nodes are added, the dimensions/fact_table size ratio on each node therefore becomes unbalanced (it increases), which in turn limits the number of parallel nodes that can be used effectively. In this paper we propose and evaluate a parallel DW storage model that overcomes these limitations and delivers optimal speed-up and scale-up capabilities with top efficiency. We use the TPC-H benchmark to evaluate the scalability and efficiency of the proposed model.
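
The growth of the dimensions/fact_table ratio can be illustrated with a small back-of-the-envelope sketch. It is not taken from the paper: it assumes the fact table is split evenly across nodes while each node keeps a full copy of the dimensions, and the function name and the sizes (1000 GB of facts, 10 GB of dimensions) are hypothetical values chosen only for illustration.

```python
def per_node_ratio(fact_size, dim_size, nodes):
    """Dimension-to-fact size ratio seen by each node.

    Assumption (not stated in the abstract): the fact table is
    partitioned evenly, and every node holds all dimension data.
    """
    fact_per_node = fact_size / nodes  # fact slice shrinks with node count
    return dim_size / fact_per_node    # dimension size stays constant


# Hypothetical sizes in GB, for illustration only.
FACT_GB = 1000.0
DIM_GB = 10.0

for n in (1, 10, 100):
    print(f"{n:>3} nodes: ratio = {per_node_ratio(FACT_GB, DIM_GB, n):.2f}")
```

Under these assumptions the ratio grows linearly with the number of nodes, so at 100 nodes each node stores as much dimension data as fact data, which is the kind of imbalance the abstract describes.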