TEEPA: a timely-aware elastic parallel architecture

Authors:
João Pedro Costa;Pedro Martins;José Cecilio;Pedro Furtado
Affiliations:
ISEC- Polytechnic Institute of Coimbra, Rua Pedro Nunes, Coimbra, Portugal;University of Coimbra, Coimbra, Portugal;University of Coimbra, Coimbra, Portugal;University of Coimbra, Coimbra, Portugal
Venue:
Proceedings of the 16th International Database Engineering & Applications Sysmposium
Year:
2012

Citing 16
Cited 1

Optimizing equijoin queries in distributed databases where relations are hash partitioned

ACM Transactions on Database Systems (TODS)
Accurate modeling of the hybrid hash join algorithm

SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Implementation techniques for main memory database systems

SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
A Hash Partition Strategy for Distributed Query Processing

EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology
Performance Measurements of Compressed Bitmap Indices

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Join algorithm costs revisited

The VLDB Journal — The International Journal on Very Large Data Bases
Time-Stratified Sampling for Approximate Answers to Aggregate Queries

DASFAA '03 Proceedings of the Eighth International Conference on Database Systems for Advanced Applications
C-store: a column-oriented DBMS

VLDB '05 Proceedings of the 31st international conference on Very large data bases
A comparison of approaches to large-scale data analysis

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
HadoopDB: an architectural hybrid of MapReduce and DBMS technologies for analytical workloads

Proceedings of the VLDB Endowment
HadoopDB in action: building real world applications

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
MOSS-DB: a hardware-aware OLAP database

WAIM'10 Proceedings of the 11th international conference on Web-age information management
Hadoop++: making a yellow elephant run like a cheetah (without it even noticing)

Proceedings of the VLDB Endowment
Efficient processing of data warehousing queries in a split execution environment

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
ONE: a predictable and scalable DW model

DaWaK'11 Proceedings of the 13th international conference on Data warehousing and knowledge discovery
A predictable storage model for scalable parallel DW

Proceedings of the 15th Symposium on International Database Engineering & Applications

Cloudy: heterogeneous middleware for in time queries processing

Proceedings of the 17th International Database Engineering & Applications Symposium

Quantified Score

Hi-index	0.00

Visualization

Abstract

Parallel Shared-Nothing architectures are frequently used to handle large star-schema Data Warehouses (DW). The continuous increase in data volume and the star-schema storage organization introduce severe limitations to scalability due to the well-known parallel join issues and the resulting need to use solutions such as on-the fly repartitioning of data or intermediate results, or massive replication of large data sets that still need to be joined locally, constraining their ability to deliver fast results. Parallelism may improve query performance, however some business decisions may require that query results be timely available which, even with additional parallelism and significant upgrade costs (both monetary and due to disturbance of normal operations), cannot be guaranteed. We propose a Timely-aware Execution Parallel Architecture (TEEPA) which balances data load and query processing among an elastic set of non-dedicated heterogeneous nodes in order to provide scale-out performance and timely query results. Data is allocated using adaptable storage models to minimize join costs (the major uncertainty factor) which best fit the nodes' capabilities, while preserving a consistent logical view of the star-schema. We present experimental evaluation of TEEPA and demonstrate its ability to provide timely results.